Preface



OVERVIEW 

Human-computer interaction (HCI) has evolved into a recognized discipline that attracts innovation and creativity.
For the last 25 years, it has inspired new solutions, especially for the benefit of the user as a human being, making
the user the focal point that technology should serve rather than the other way around. The advent of the
Internet, combined with the rapidly falling prices and increasing capability of personal computers, among
other things, made the 1990s a period of very rapid change in technology. This had major implications for HCI
research and advances, as people’s demands and expectations as users of technology increased.

There is currently no agreed-upon definition of the range of topics that form the area of human-computer
interaction. Based on the definition given by the ACM Special Interest Group on Computer-Human
Interaction Curriculum Development, which is also repeated in most HCI literature, the following is
considered an acceptable definition:

Human-computer interaction is a discipline concerned with the design, evaluation and implementa- 
tion of interactive computing systems for human use in a social context, and with the study of major 
phenomena surrounding them. 

A significant number of major corporations and academic institutions now study HCI. Many computer 
users today would argue that computer makers are still not paying enough attention to making their products 
“usable”. HCI is undoubtedly a multi-disciplinary subject, which draws on disciplines such as psychology, 
cognitive science, ergonomics, sociology, engineering, business, graphic design, technical writing, and, most 
importantly, computer science and system design/software engineering. 

As a discipline, HCI is relatively young. Throughout the history of civilization, technological innovations 
were motivated by fundamental human aspirations and by problems arising from human-computer interac- 
tions. Design, usability and interaction are recognised as the core issues in HCI. 

Today, profound changes are taking place that touch all aspects of our society: changes in work, home, 
business, communication, science, technology, and engineering. These changes, as they involve humans, 
cannot but influence the future of HCI since they relate to how people interact with technology in an 
increasingly dynamic and complex world. This makes it even more essential for HCI to play a vital role in 
shaping the future. 

Therefore, preparing an encyclopedia of HCI that can contribute to the further development of science
and its applications requires not only providing basic information on this subject, but also tackling problems
that involve HCI issues in a wider sense, for example, by addressing HCI in and for various applications,
such as e-learning, health informatics, and many others.



CHALLENGES, CONTENT AND ORGANISATION 

The following are some challenges in the HCI field, which were taken into consideration when compiling this 
encyclopedia: 

• HCI is continually evolving with the fast change in technology and its cost. We, therefore, covered basic 
concepts/issues and also new advances in the field. 

• The need to strike a balance between covering theory, methods/models, applications, experiences, and 
research. The balance was sought to provide a rich scientific and technical resource from different 
perspectives. 

• The most important purpose of an encyclopedia in a particular discipline is to be a basic reference work 
for readers who need information on subjects in which they are not experts. The implication of “basic” 
is that an encyclopedia, while it should attempt to be comprehensive in breadth of coverage, cannot 
be comprehensive in the depth with which it treats most topics. What constitutes breadth of coverage 
is always a difficult question, and it is especially so for HCI, a relatively new discipline that has evolved 
over the past three decades and is still changing rapidly. 

• An encyclopedia should, however, direct the reader to information at a deeper level, as this 
encyclopedia does through bibliographic references, indexed keywords, and so forth. 

• This encyclopedia differs from other similar related references in that it covers core HCI topics/issues 
(that we see in most standard HCI books) as well as the use of HCI in various applications and recent 
advances and research. Thus the choice of specific topics for this encyclopedia has required our 
judgment of what is important. While there may be disagreement about the inclusion or exclusion of 
certain topics, we hope and believe that this selection is useful to a wide spectrum of readers. There 
are numerous articles that integrate the subject matter and put it into perspective. Overall, the 
encyclopedia is a general reference to HCI, its applications, and directions. 

In order to meet these challenges, we invited professionals and researchers from many relevant fields
and areas of expertise to contribute. The resulting articles that appear in this volume were selected through a
double-blind review process followed by rounds of revision prior to acceptance. Treatment of certain topics is not
exclusive according to a given school or approach, and you will find a number of topics tackled from different 
perspectives with differing approaches. A field as dynamic as HCI will benefit from discussions, different 
opinions, and, wherever possible, a consensus. 

An encyclopedia traditionally presents definitive articles that describe well-established and accepted 
concepts or events. While we have avoided the speculative extreme, this volume includes a number of entries 
that may be closer to the “experimental” end of the spectrum than the “well-established” end. The need to 
do so is driven by the dynamics of the discipline and the desire, not only to include the established, but also 
to provide a resource for those who are pursuing the experimental. Each author has provided a list of key 
terms and definitions deemed essential to the topic of his or her article. Rather than aggregate and filter these 
terms to produce a single “encyclopedic” definition, we have preferred instead to let the authors stand by 
their definition and allow each reader to interpret and understand each article according to the specific 
terminology used by its author(s). 

Physically, the articles are printed in alphabetical order by their titles. This decision was made based on 
the overall requirements of Idea Group Reference’s complete series of reference encyclopedias. The 
articles are varied, covering the following main themes: 1) Foundation (e.g., human, computer, interaction,
paradigms); 2) Design Process (e.g., design basics, design rules and guidelines, HCI in software development,
implementation, evaluation, accessible design, user support); 3) Theories (e.g., cognitive models, social
context and organisation, collaboration and group work, communication); 4) Analysis (e.g., task analysis,
dialogue/interaction specification, modelling); and 5) HCI in various applications (e.g., e-learning, health
informatics, multimedia, Web technology, ubiquitous computing, mobile computing).

This encyclopedia serves to inform practitioners, educators, students, researchers, and all who have an 
interest in the HCI field. Also, it is a useful resource for those not directly involved with HCI, but who want 
to understand some aspects of HCI in the domain they work in, for the benefit of “users”. It may be used 
as a general reference, research reference, and also to support courses in education (undergraduate or 
postgraduate). 



CONCLUSION 

Human-computer interaction will continue to strongly influence technology and its use in our everyday life.
In order to help develop more “usable” technology that is “human/user-centred”, we need to understand what
HCI can offer on these fronts: theoretical, procedural, social, managerial, and technical.

The process of editing this encyclopedia and the interaction with international scholars have been most
enjoyable. This book is truly an international endeavour. It includes 109 entries contributed by
talented authors from around the world, who brought invaluable insights, experiences, and
expertise, with varied and most interesting cultural perspectives on HCI and its related disciplines.

It is my sincere hope that this volume serves not only as a reference to HCI professionals and researchers, 
but also as a resource for those working in various fields, where HCI can make significant contributions and 
improvements. 

Claude Ghaoui 

Liverpool John Moores University, UK 
INTRODUCTION

According to Raskin (2000), the way we interact 
with a product, what we do, and how it responds are 
what define an interface. This is a good starting 
definition in one important respect: an interface is 
not something given or an entirely predefined prop- 
erty, but it is the dynamic interaction that actually 
takes place when a product meets the users. More
precisely, an interface is the interaction that mediates
the relation between the user and a tool, indicating
which approach is necessary to exploit its functions.
Hence, an interface can be considered a
mediating structure.

A useful example of a mediating structure
is provided by so-called stigmergy. Looking at
animal-animal interactions, Raskin (2000) noted
that termites were able to build their collective nest,
even if they did not seem to collaborate or communicate
with each other. The explanation provided by
Grasse (Susi et al., 2001) is that termites do interact
with each other, even if their interactions are mediated
through the environment. According to the
stigmergy theory, each termite acts upon the work
environment, changing it in a certain way. The
environment physically encodes and stores the
change made upon it so that every change becomes
a clue that triggers a certain reaction. Analogously,
we might claim that an interface mediates
the relation between the user and a tool, affording
him or her to use it in a certain way. 1 The
kind of mediation involved can be fruitfully investigated
from an epistemological point of view. More
precisely, we claim that the process of mediating can
be understood better when it is considered to be an
inferential one.
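
The stigmergic mechanism described above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions, not a model of real termite behaviour: the names Environment and act, the one-dimensional row of building sites, and the rule "build on the tallest nearby pile" are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """A row of building sites; each cell stores how much material was deposited."""
    deposits: list = field(default_factory=lambda: [0] * 5)


def act(env: Environment, start: int) -> int:
    """One agent step: read the clues around `start` and build on the tallest pile.

    The agent never communicates with other agents; it only reads and changes `env`.
    (Hypothetical rule chosen for illustration.)
    """
    neighbours = [i for i in (start - 1, start, start + 1) if 0 <= i < len(env.deposits)]
    target = max(neighbours, key=lambda i: env.deposits[i])
    env.deposits[target] += 1  # the change is stored in the environment itself
    return target


env = Environment()
env.deposits[2] = 1  # a first, arbitrary deposit acts as the initial clue
# Four independent agents arrive at different positions, one after another.
sites = [act(env, start) for start in (1, 3, 2, 3)]
```

Each agent ends up building at the same site although no agent addresses another directly: the coordination is carried entirely by the state stored in env, which is the sense in which the environment acts as a mediating structure.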



BACKGROUND 

Several researchers (Kirsh, 2004; Hollan et al.,
2000) recently have pointed out that designing an interface
means displaying as many clues as possible
from which the user can infer correctly and quickly
what to do next. However, although the inferential
nature of such interactions is acknowledged, as yet, 
no model has been designed that takes it into ac- 
count. For instance, Shneiderman (2002) has sug- 
gested that the value of an interface should be 
measured in terms of its consistency, predictability, 
and controllability. To some extent, these are all 
epistemological values. In which sense could an 
interaction be predictable or consistent? How can 
understanding the inferential nature of human-com- 
puter interaction shed light on the activity of design- 
ing good interfaces? Here, the epistemological task 
required is twofold: first, investigating what kind of 
inference is involved in such an interaction; and 
second, explaining how the analysis of the nature of 
computer interaction as inferential can provide use- 
ful hints about how to design and evaluate infer- 
ences. 

For both of these issues, we
shall refer to the concept of abduction as the keystone
of an epistemological model.

THE ROLE OF ABDUCTION IN 
DESIGNING INTERFACES 

More than one hundred years ago, Charles Sanders 
Peirce (1923) pointed out that human performances 
are inferential and mediated by signs. Here, signs 
can be icons or indexes but also conceptions, images, 
and feelings. Analogously to the case of stigmergy,
we have signs or clues that can be icons but also 
symbols and written words from which certain con- 
clusions are inferred. 

According to Peirce (1972), all those perfor- 
mances that involve sign activities are abductions. 
More precisely, abduction is that explanatory pro- 
cess of inferring certain facts and/or hypotheses that 
explain or discover some phenomenon or observa- 
tion (Magnani, 2001). Abductions that solve the 
problem at hand are considered inferences to the 
best explanation. Consider, for example, the method 
of inquiring employed by detectives (Eco & Sebeok, 
1991). In this case, we do not have direct experience 
of what we are talking about. Say, we did not see the
murderer killing the victim, but we infer that given 
certain signs or clues, a given fact must have hap- 
pened. Analogously, we argue that the mediation 
activity brought about by an interface is the same as 
that employed by detectives. Designers that want to 
make their interface more comprehensible must 
uncover evidence and clues from which the user is 
prompted to infer correctly the way a detective 
does; this kind of inference could be called infer- 
ence to the best interaction. 

We can conclude that how good an interface is 
depends on how easily we can draw the correct 
inference. A detective easily can discover the mur- 
derer, if the murderer has left evidence (clues) from 
which the detective can infer that that person and 
only that person could be guilty. Moreover, that an 
inference could be performed easily and success- 
fully also depends upon how quickly one can do that. 
Sometimes, finding the murderer is very difficult. It 
may require a great effort. Therefore, we argue that 
how quick the process is depends on whether it is 
performed without an excessive amount of process- 
ing. If clues are clear and well displayed, the infer- 
ence is drawn promptly. As Krug (2000) put it, it 
does not have to make us think. 

In order to clarify this point even more, let us 
introduce the important distinction between theoreti- 
cal and manipulative abduction (Magnani, 2001). 
The distinction provides an interesting account to 
explain how inferences that exploit the environment 
visually and spatially, for instance, provide a quicker 
and more efficient response. Sentential and manipulative
abductions mainly differ regarding whether
the exploitation of the environment is or is not crucial to
carrying out reasoning. Sentential abduction mostly
refers to a verbal dimension of abductive inference, 
where signs and clues are expressed in sentences or 
in explicit statements. This kind of abduction has 
been applied extensively in logic programming (Flach 
& Kakas, 2000) and in artificial intelligence, in 
general (Thagard, 1988). 

In contrast, manipulative abduction occurs when 
the process of inferring mostly leans on and is driven 
by the environment. Here, signs are diagrams, kines- 
thetic schemas, decorated texts, images, spatial 
representations, and even feelings. In all those ex- 
amples, the environment embodies clues that trigger 
an abductive process, helping to unearth information 
that otherwise would have remained invisible. Here, 
the exploitation of the environment comes about 
quickly, because it is performed almost tacitly and 
implicitly. Accordingly, many cases have demonstrated
that problem-solving activities that use
visual and spatial representation, for instance, are
quicker and more efficient than sentential ones. We
can conclude that, in devising interfaces, designers
have to deal mostly with manipulative abduction.
Interfaces that lean on the environment are tacit and
implicit and, for this reason, much quicker than
sentential ones.

Investigating the activity of designing interfaces 
from the abductive epistemological perspective de- 
scribed previously helps designers in another impor- 
tant respect: how to mimic the physical world within 
a digital one to enhance understanding. 

As we have seen previously, the environment 
enables us to trigger inferential processes. But it can 
do that if and only if it can embody and encode those 
signs from which one can infer what to do next. For 
example, if you are working in your office and would 
appreciate a visit from one of your colleagues, you 
can just keep the door open. Otherwise, you can 
keep it closed. In both cases, the environment en- 
codes the clue (the door kept open or closed), from 
which your colleagues can infer whether you do or 
don’t want to be disturbed. The questions that
immediately come up are: How can we encode those
signs in a digital world? How can we enrich it so as 
to render it capable of embodying and encoding 
clues? 

The question of how to enrich the digital world 
mainly concerns how to mimic some important fea- 
tures of the physical world in the digital one. Often, 
common people refer to an interface as easy-to-use,
because it is more intuitive. Therefore, we don’t need 
to learn how the product actually works. We just 
analogically infer the actions we have to perform 
from ordinary ones. More generally, metaphors are 
important in interface design, because they relate 
digital objects to the objects in the physical world, 
with which the user is more familiar. 2 

In the history of computer interfaces, many attempts
have been made to replicate some physical
features in the digital world. For instance, replacing
command-driven modes with windows was one of
the most important insights in the history of technology
and human-computer interaction (Johnson, 1997).
It enabled users to think spatially, say, in terms of
“where is what I am looking for?” and not in terms of
“what sequence of letters do I type to call up this
document?”

Enriching the digital world deals to some extent 
with faking, transforming those features embedded 
in the physical world into illusions. For example, 
consider the rule of projection first invented by 
Filippo Brunelleschi and then developed by such 
painters as Leon Battista Alberti and Leonardo da
Vinci. In Peircean terms, what these great painters
did was to scatter those signs to create the illusion of
three-dimensional representations. It was a trick that
exploited the inferential nature of visual construction
(Hoffman, 1998). 3

Now, the question is, how could we exploit inferential
visual dimensions to enhance the interaction in
the digital world? In the window metaphor, we do not
have rooms, edges, or folders, such as in the physical
world. They are illusions, and they are all produced
by an inferential (abductive) activity of human perception,
analogously to what happens in collapsing
three dimensions into two. Here, we aim at showing
how visual as well as spatial, temporal, and even
emotional abductive dimensions can be implemented
fruitfully in an interface. Roughly speaking, we argue
that enriching the digital world precisely means
scattering clues and signs that to some extent fake
spatial, visual, emotional, and other dimensions,
even if that just happens within a flat environment.

We argued that the nature of signs can be verbal 
and symbolic as well as visual, spatial, temporal, and 
emotional. In correspondence with these last cases, 
one can recognize three abductive dimensions: vi- 
sual, spatial, and emotional abduction. 4 We will 
discuss each of them in detail, providing examples 
taken from Web designs. 

Visual Abductive Inference

Visual dimension is certainly one of the most ubiq- 
uitous features in Web interaction. Users mainly 
interact with Web pages visually (Kirsh, 2004; 
Shaik et al., 2004). Here, signs and clues are colors,
text size, dotted lines, text format (e.g., bold, under- 
line, italics); they convey visual representations and 
can assign weight and importance to some specific 
part. Consider, for example, the navigation menu in 
Figure 1. 

Here, colors, capital letters, and text size provide
visual clues capable of enhancing the processing of 
information. The attention immediately is drawn to 
the menu header that represents its content (con- 
ference and research); capital letters and colors 
serve this function. 

Then, the dotted list of the same color as the
menu header informs the user about the number of
the items. Hence, the fact that items are not marked
visibly as menu headers gives a useful overview 
(Figure 2). Once the user has chosen what to see 
(conference or research), he or she can proceed to 
check each item according to his or her preference 
(Figure 3). 

In this example, the user is guided to draw the 
correct inference; it enables the user to understand 
what he or she could consult. 



Figure 1.

Figure 2. Figure 3.

Figure 6.

bash-2.05b$ rm note.txt



In contrast, consider, for example, the same 
content represented in Figure 4. 

In this case, even if the content is identical, the 
user does not have any visual clue to understand 
what he or she is going to consult. The user should 
read all the items to infer and, hence, to understand, 
that he or she could know something about past and 
future conferences and about the research topics. If 
one stopped the user’s reading after the third item 
(MBR04), the user could not infer that this page 
deals also with philosophy of science, with episte- 
mology, and so forth. The user doesn’t have enough 
clues to infer that. In contrast, in the first example, 
the user is informed immediately that this Web site 
contains information about conferences and re- 
search. 



Figure 5. 




Spatial Abductive Inference 

As already mentioned, the windows metaphor is 
certainly one of the most important insights in the 
history of interface technology. This is due to the 
fact that, as Johnson maintains, it enables the user to 
think in terms of “where is what I am looking for?” 
and not in terms of “what sequence of letters do I 
type to call up this document?” as in a command line 
system (Johnson, 1997). The computer becomes a 
space where one can move through just double- 
clicking on folders or icons, or dragging them. The 
difference is described well in Figure 5 and Figure 6. 

In Figure 5, the file named note.txt is deleted by 
dragging it to the bin (i.e., the task of deleting is 
accomplished by a movement analogous to that used 
in the physical setting). In Figure 6, by contrast, the
task is carried out by typing a command line composed
of the command itself (rm, which stands for
remove) and the file to be deleted (note.txt).

In designing Web pages, the spatial dimension can be
mimicked in other ways. One of the best-known
examples is the so-called tab. Tabs
usually are employed in the real world to keep track
of something important, to divide whatever they
stick out of into sections, or to make it easy to open
(Krug, 2000). In a Web site, tabs turn out to be very
important navigation clues. Browsing a Web site,
users often find themselves lost. This happens especially
when the Web pages they are consulting do not
provide spatial clues from which the user can easily
infer where he or she is. For instance, several Web
sites change their layout on almost every page; even
if they provide a navigation menu, it is not
helpful at all. In contrast, tabs enhance spatial
inference in one important respect.

Figure 7.

Consider the navigation bar represented in Fig- 
ure 7. 

In this example, when the user is visiting a certain
page (e.g., a homepage) (Figure 7), the corresponding
tab in the navigation bar becomes the same
color as the page body. As Krug (2000) noted, this
creates the illusion that the active tab actually moves
to the front. Therefore, the user immediately can
infer where he or she is by exploiting spatial relations
in terms of background-foreground.

Emotional Abductive Inference 

Recently, several researchers have argued that
emotion could be very important to improve usability
(Lavie & Tractinsky, 2004; Norman, 2004; Schaik &
Ling, 2004). The main issue in the debate is how
emotionally evocative pages could help the user to
enhance understanding.

Here, abduction once again may provide a useful
framework to tackle this kind of question. As Peirce
(1923) put it, emotion is the same thing as a hypothetic
inference. For instance, when we look at a painting,
the organization of the elements in colors, symmetries,
and content are all clues that trigger a certain
reaction.

Consider, for example, the way a computer program
responds when the user attempts a forbidden
operation. An alert message
suddenly appears, often coupled with an unpleasant
sound (Figure 8).

In this case, the response of the system provides 
clues (sounds and vivid colors such as red or yellow) 
from which we can attribute a certain state to the 
computer (being upset) and, hence, quickly react to 
it. Moreover, engaging an emotional response ren- 
ders that reaction instantaneous; before reading the 
message the user already knows that the operation 
requested cannot proceed. Thus, a more careful 
path can be devised. Exploiting emotional reactions 
also can be fruitful in another respect. It conveys a 
larger amount of information. For instance, univer- 
sity Web sites usually place a picture of some 
students engaged in social activity on their homepage. 
This does not provide direct information about the 
courses. However, this triggers a positive reaction in 
connection with the university whose site is being 
visited. The way icons are drawn also aims at emotionally
affecting the user. Even if they do not strictly
resemble the physical feature, they can prompt a
reaction. Consider the icon in Figure 9.

The winged envelope suggests rapidity, quickness,
and all the attributes that recall efficiency and
trustworthiness.



FUTURE TRENDS 

In the previous section, we have tried to provide a sketch
of the role of abduction in human-computer
interaction. Several questions still remain. We have
illustrated some examples related to visual, spatial,
and emotional abduction. However, other abductive
aspects can be addressed. For instance, it would be
useful to investigate the temporal dimension. How is
it possible to keep good track of the history of a
user’s actions?

Another question is related to the role of meta- 
phors and analogies in designing interfaces. We 
have pointed out that metaphors are important in 
interface design, because they relate the digital 
objects to the objects in the physical world. How- 
ever, the physical objects that we may find in the real 
world also are cultural ones; that is, they belong to a 
specific cultural setting. Something that is familiar in 
a given context may turn out to be obscure or even 
misleading in another one. Here, the task required is 
to investigate how cultural aspects may be taken into 
account to avoid misunderstandings. 

CONCLUSION 

In this article, we have claimed that interfaces play 
a key role in understanding the human-computer 
interaction. Referring to it as a mediating structure, 
we also have shown how human-computer interac- 
tions can be understood better using an inferential 
model. The interface provides clues from which the 
user can infer correctly how to cope with a product. 
Hence, we have referred to that inferential process 
as genuinely abductive. In the last section, we
suggested possible future trends relying on some
examples from Web interface design.
KEY TERMS

Abduction: The explanatory process of infer- 
ring certain facts and/or hypotheses that explain or 
discover some phenomenon or observation. 

Affordance: Can be viewed as a property of an 
object that supports certain kinds of actions rather 
than others. 

Enriching Digital World: The process of em- 
bedding and encoding those clues or signs within an 
interface from which the user is enabled to exploit 
the functionalities of a certain product. 

Inference to the Best Interaction: Any inter- 
face provides some clue from which the user is 
enabled to perform the correct action in order to 
accomplish tasks with a product. The process of 
inferring the correct action can be called inference 
to the best interaction. 



Interface: The way a user interacts with a 
product, what he or she does, and how it responds. 

Mediating Structure: What coordinates the 
interaction between a user and a tool providing 
additional computational resources that simplify the 
task. 

Stigmergy: The process that mediates the interaction
among animals through the
environment and provides those clues from which
any agent is able to infer what to do next.

ENDNOTES 

1 The concept of affordance is akin to that of 
stigmergy. We might say that the change stored 
within the environment affords a certain reac- 
tion rather than others. For more information 
about the concept of affordance, see Gibson 
(1979). 

2 There are several problems related to meta- 
phors in interface design. For further informa- 
tion on this topic, see Collins (1995). 

3 About the inferential role of perception, see 
Thagard and Shelley (1997). 

4 For further details about visual, spatial, and
temporal abduction, see Magnani (2001). About
emotion as an abductive inference, see Peirce
(1972).
INTRODUCTION

The use of computers and the Internet in
education is on the increase. Web-based educational
systems (WES) are now widely used both to
support distance learning and to complement the
traditional teaching process in the classroom.

To be truly useful in the classroom, a WES is
expected to be
aware of students’ behaviors so that it can take into
account the students’ level of knowledge and preferences
in order to make reasonable recommendations
(Hong, Kinshuk, He, Patel, & Jesshope,
2001).

The main goal of adaptation in educational sys- 
tems is to guide the students through the course 
material in order to improve the effectiveness of the 
learning process. 

Usually, when speaking of adaptive Web-based
educational systems, we refer also to adaptable
systems. Nevertheless, these terms are not really
synonyms. Adaptable systems are abundant (Kobsa,
2004). In these systems, any adaptation is predefined
and can be modified by the users before the
execution of the system. In contrast, adaptive systems
are still quite rare. In adaptive systems,
adaptation is dynamic: it changes while the user
is interacting with the system, depending on the user’s
behavior.

Adaptable and adaptive systems have recently
gained strong popularity on the Web under the
notion of personalized systems. A system can be
adaptable and adaptive at the same time.

In an educational context, adaptable systems also
include those systems that allow the teacher to
modify certain parameters and change the response
that the system gives to the students.
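
The difference between the two kinds of personalization can be sketched in a few lines. This is a minimal, hypothetical example and does not describe any particular WES: the class HintPolicy, the parameter max_hints, and the rule "one more hint per consecutive failure" are assumptions made only for illustration.

```python
class HintPolicy:
    """Decides how many hints a student receives for an exercise (hypothetical)."""

    def __init__(self, max_hints: int = 3):
        # Adaptable part: a parameter the teacher sets before the system runs.
        self.max_hints = max_hints
        # Adaptive part: state the system updates while the student is working.
        self.recent_failures = 0

    def record_attempt(self, correct: bool) -> None:
        """Update the adaptive state from the observed behaviour."""
        self.recent_failures = 0 if correct else self.recent_failures + 1

    def hints_to_show(self) -> int:
        """More consecutive failures lead to more hints, capped by the teacher's setting."""
        return min(self.recent_failures, self.max_hints)


policy = HintPolicy(max_hints=2)       # teacher configuration (adaptable)
for outcome in (False, False, False):  # observed interaction (adaptive)
    policy.record_attempt(outcome)
```

After three failed attempts the policy would offer two hints, because the teacher's cap applies; a single correct attempt resets the count. The teacher-controlled cap and the behaviour-driven counter coexist in the same object, which mirrors the remark that a system can be adaptable and adaptive at the same time.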



We claim that, in an educational
context, it is important to provide both types of
personalization. On the one hand, it is necessary to let
teachers control the adaptation to students. On the
other hand, due to the great diversity of interactions
that take place in a WES, it is necessary to help
teachers in the assessment of the students’ actions
by providing certain dynamic adaptation automatically
performed by the system. In this article, we will
present how we can obtain adaptable and adaptive
systems. Next, we will briefly present how we
combine both types of personalization in
PDINAMET, a WES for Physics. Finally, we will
describe some future trends and conclusions.



BACKGROUND 

To provide personalization, systems store the infor- 
mation needed in the so-called models for adapta- 
tion. These models contain information about users’ 
characteristics and preferences (the so-called user 
model). Educational systems also need information 
about the domain that is being taught (the so-called 
domain model) and the pedagogical strategies that 
will be followed when guiding students (the so-called 
pedagogical model). The first systems to incorporate
these models were intelligent tutoring
systems (Wenger, 1987).

These models usually make use of an attribute- 
value representation. The value of each attribute 
can be obtained directly from the users by means of 
initial questionnaires (for example, to acquire per- 
sonal data). Other attributes can be directly obtained 
from the data that the system logs from the users’ 
interaction (for example, number of course pages 
visited) (Gaudioso & Boticario, 2002). Nevertheless,
certain attributes can be obtained neither directly
from the user nor from the logged data, and
they must be automatically inferred by the system.
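
The attribute-value representation described above can be sketched as follows. This is only an illustration under stated assumptions, not the data model of any system cited here: the attribute names (`name`, `pages_visited`, `knowledge_level`), the thresholds, and the inference rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UserModel:
    """Attribute-value user model with three sources of values."""
    declared: dict = field(default_factory=dict)  # from initial questionnaires
    logged: dict = field(default_factory=dict)    # from interaction logs
    inferred: dict = field(default_factory=dict)  # computed by the system

    def get(self, attribute):
        # Look in declared values first, then logged, then inferred.
        for source in (self.declared, self.logged, self.inferred):
            if attribute in source:
                return source[attribute]
        return None


def infer_knowledge_level(model, total_pages=20):
    """Hypothetical inference: estimate knowledge from page coverage."""
    visited = model.logged.get("pages_visited", 0)
    coverage = visited / total_pages
    if coverage >= 0.75:
        level = "high"
    elif coverage >= 0.4:
        level = "medium"
    else:
        level = "low"
    model.inferred["knowledge_level"] = level
    return level


student = UserModel(declared={"name": "Ana"}, logged={"pages_visited": 9})
infer_knowledge_level(student)
```

Keeping the three sources separate makes it easy to see which values came from the user, which were logged, and which are only the system's estimate.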

Various methods can be used to infer these at- 
tributes (Kobsa, 2004). These methods include simple 
rules that predict user’s characteristics or assign 
users to predetermined user groups with known char- 
acteristics when certain user actions are being ob- 
served (the so-called profiles or stereotypes). 
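
A rule-based assignment to stereotypes might look like the sketch below. The stereotype names, the observed attributes (`exercises_failed`, `exercises_solved`), and the thresholds are assumptions chosen for illustration only.

```python
# Each rule pairs a condition on observed data with a stereotype name.
# Rules are checked in order; the first match wins.
STEREOTYPE_RULES = [
    (lambda obs: obs.get("exercises_failed", 0) >= 3, "needs_support"),
    (lambda obs: obs.get("exercises_solved", 0) >= 5, "advanced"),
]

DEFAULT_STEREOTYPE = "regular"


def assign_stereotype(observations):
    """Return the first stereotype whose rule matches the observations."""
    for condition, stereotype in STEREOTYPE_RULES:
        if condition(observations):
            return stereotype
    return DEFAULT_STEREOTYPE
```

Because the rule list is fixed before the system runs, this style of inference is static in the sense discussed in the text.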

The main disadvantage of the rule-based approach
is that the rules (and the profiles) have to be
pre-defined, so the whole process is very static. To
make the process more dynamic, other
methods exist to obtain the value of those attributes
(Kobsa, 2004). Probabilistic reasoning methods take
uncertainty and evidence from users’ characteristics
and interaction data into account (Gertner, Conati,
& VanLehn, 1998). Plan recognition methods aim at 
linking individual actions of users to presumable 
underlying plans and goals (Dario, Mendez, Jimenez, 
& Guzman, 2004). Machine learning methods try to 
detect regularities in users’ actions (and to use the 
learned patterns as a basis for predicting future 
actions) (Soller & Lesgold, 2003). 
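
As one illustration of the probabilistic family, the sketch below applies Bayes' rule to update the probability that a student knows a concept after observing a single answer. The `slip` and `guess` parameters and their default values are assumptions made for this example; they are not taken from the systems cited above.

```python
def update_knowledge(prior, correct, slip=0.1, guess=0.2):
    """Posterior P(knows | answer) computed with Bayes' rule.

    slip  = P(wrong answer | knows)          (assumed value)
    guess = P(correct answer | does not know) (assumed value)
    """
    if correct:
        likelihood_knows = 1.0 - slip
        likelihood_not = guess
    else:
        likelihood_knows = slip
        likelihood_not = 1.0 - guess
    evidence = likelihood_knows * prior + likelihood_not * (1.0 - prior)
    return likelihood_knows * prior / evidence


# Start undecided and fold in three observed answers, one at a time.
belief = 0.5
for answer in (True, True, False):
    belief = update_knowledge(belief, answer)
```

Unlike a fixed rule, the estimate here moves gradually with each new piece of evidence, which is what makes the process dynamic.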

These systems can be considered adaptive, since
the adaptation is dynamic and is not controlled by
users. For example, once a rule is defined, users usually
cannot modify it. Another review (Brusilovsky &
Peylo, 2003) differentiates between adaptive and
intelligent WES; its authors consider adaptive systems
to be those that attempt to be different for different
students and groups of students by taking into account
information accumulated in the individual or
group student models. They
consider intelligent systems to be those that apply
techniques from the field of artificial intelligence to
provide broader and better support for the users of
WES. In many cases, Web-based adaptive educational
systems fall into both categories. According
to our classification, both adaptive and intelligent
systems can be considered adaptive. We think
that our classification is more appropriate from a
user’s point of view: a user usually cares less
about how the adaptation is done than about whether
she or he can modify it. A complete
description of the personalization process (distinguishing
between adaptive and adaptable systems)
can be found in Kobsa, Koenemann, and Pohl (2001)
and Oppermann, Rashev, and Kinshuk (1997).



As mentioned earlier, in the educational domain, 
it is necessary to let teachers control the personal- 
ization process. Thus it seems necessary to combine 
capabilities of both adaptive and adaptable systems. 

A HYBRID APPROACH TO WEB- 
BASED EDUCATIONAL SYSTEMS 

In this section, we present PDINAMET, a system 
that provides both types of personalization. 

PDINAMET is a Web-based adaptive and adaptable
educational system for teaching
dynamics within physics. In
PDINAMET, we maintain three types of models:
student model, domain model, and pedagogical model.

Besides personal data, the student model con- 
tains information about the students’ understanding 
of the domain. The domain model includes information
about the contents that should be taught. Finally,
the pedagogical model includes information about
instructional strategies.

In PDINAMET, we consider an adaptation
task to be any support provided by the system to
learners and teachers that takes into account learners’
personal characteristics and knowledge level.
Two types of adaptation tasks are
considered: static (which make PDINAMET adaptable)
and dynamic (which make PDINAMET adaptive).
Static adaptation tasks are those based on pre-defined
rules. These include assigning students to
pre-defined profiles (which can be modified by teachers)
on which PDINAMET bases certain recommendations
(e.g., recommending a suitable exercise).
To overcome the lack of coverage of pre-defined
rules for every situation that could arise during the
course, PDINAMET also performs some dynamic tasks.
These tasks include diagnosis of student
models, intelligent analysis of students’ interactions,
and recommendation of instructional strategies
(Montero & Gaudioso, 2005).
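
The combination of static and dynamic adaptation tasks can be sketched as below. This is not PDINAMET's implementation: the class `HybridTutor`, the profile names, the difficulty scale, and the success-rate thresholds are all hypothetical and serve only to show a teacher-editable rule table working alongside a data-driven adjustment.

```python
class HybridTutor:
    """Combines teacher-editable profile rules with a dynamic estimate."""

    def __init__(self):
        # Static part: profile -> recommended exercise difficulty.
        # Teachers may modify this table (adaptable behaviour).
        self.profile_difficulty = {
            "beginner": 1,
            "intermediate": 2,
            "advanced": 3,
        }
        # Dynamic part: per-student outcomes gathered during interaction.
        self.history = {}

    def set_profile_difficulty(self, profile, difficulty):
        """Teacher-facing hook for changing a pre-defined rule."""
        self.profile_difficulty[profile] = difficulty

    def record_attempt(self, student, solved):
        self.history.setdefault(student, []).append(bool(solved))

    def recommend_difficulty(self, student, profile):
        base = self.profile_difficulty[profile]
        attempts = self.history.get(student, [])
        if len(attempts) < 3:
            return base              # too little data: static rule only
        rate = sum(attempts) / len(attempts)
        if rate > 0.8:
            return base + 1          # adaptive: student is doing well
        if rate < 0.4:
            return max(1, base - 1)  # adaptive: student is struggling
        return base
```

The teacher keeps control through `set_profile_difficulty`, while the system adjusts the recommendation on its own once it has observed enough attempts.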

FUTURE TRENDS 

We have seen that teachers should have access to 
the models in order to inspect or modify them. From 
this point of view, an open question is how and when 
we should present the models to a teacher in a 
friendly and comprehensible manner and how we can
allow interactive refinement to enrich these models. 

From the point of view of technology, the educational
domain poses some challenges for the development of
adaptive systems. In the educational domain, there is
a great amount of prior knowledge provided by
teachers, which must be incorporated into the adaptation
mechanisms. For example, when using machine
learning, it is necessary to combine
knowledge and data efficiently. A long-standing fundamental
problem with machine learning algorithms is that they
do not provide easy ways of incorporating prior
knowledge to guide and constrain learning.

CONCLUSION 

In this article, we have described the differences 
between adaptable and adaptive systems. Adaptable 
systems allow the users to modify the personalization 
mechanism of the system. For example, Web portals 
permit users to specify the information they want to 
see and the form in which it should be displayed by 
their Web browsers. Nevertheless, this process is 
very static. On the other hand, adaptive systems
automatically change the response given to the users,
taking into account users’ characteristics, preferences,
and behavior. In a WES, both capabilities need to be
combined to let teachers control the
guidance given to the students and to help them in the
assessment of the students’ actions. Nevertheless,
there exist some open issues mainly regarding the 
way in which we can present the teachers with this 
functionality and how we can dynamically introduce 
the feedback from the teachers in the models for 
adaptation. Probably, if we make progress in this 
direction, we will be able to provide teachers with 
more useful Web-based adaptive systems. 

KEY TERMS

Adaptable Systems: Systems that offer personalization
that is pre-defined before the execution
of the system and that may be modified by users.

Adaptive Systems: Systems that offer personalization
that is dynamically built and automatically
performed based on what these systems learn
about the users.

Domain Model: A model that contains informa- 
tion about the course taught in a WES. A usual 
representation is a concept network specifying con- 
cepts and their relationships. 



Pedagogical Model: A model that contains 
information about the pedagogical strategies which 
will be followed when making recommendations. 

Personalization: Ability of systems to adapt 
and provide different responses to different users, 
based on knowledge about the users. 

Student Model: A user model in educational 
systems that also includes information about the 
student, for example, the student’s level of knowl- 
edge. 

User Model: A model that contains information 
about users’ characteristics and preferences. 

Web-Based Educational System: An educa- 
tional system that supports teaching through the 
Web. 

INTRODUCTION

Currently, organizations are under a regime of rapid 
economic, social, and technological change. Such a 
regime has been impelling organizations to increase 
focus on innovation, learning, and forms of enter- 
prise cooperation. To assure innovation success and 
make it measurable, it is indispensable for members 
of teams to systematically exchange information and 
knowledge. 

McLure and Faraj (2000) see an evolution in the 
way knowledge exchange is viewed from “knowl- 
edge as object” to “knowledge embedded in people,” 
and finally as “knowledge embedded in the commu- 
nity.” 

A collaborative community is a group of people,
not necessarily co-located, who share interests and
act together to contribute positively toward the
fulfillment of their common goals. The community’s
members develop a common vocabulary and language
by interacting continuously. They also create
the reciprocal trust and mutual understanding needed
to establish a culture in which collaborative practices
predominate. Such practices can capture and
apply the tacit knowledge dispersed in the organization,
embodied in people’s minds. Tacit knowledge
is a concept proposed by Polanyi (1966), meaning
a kind of knowledge that cannot be easily
transcribed into a code. It can be profitably applied
to process and/or product development and production.
Therefore, community members can powerfully
contribute to the innovation process and create
value for the organization. In doing so, they become
a fundamental work force for the organization.

BACKGROUND 

A collaborative community emerges from the search
for something new. It can arise spontaneously or in
response to a request from the firm. In both cases, each
volunteer can evaluate whether it is worthwhile to
become a member of the group.

Whenever a firm needs to decide
whether it is feasible to develop a new product,
it usually asks its senior engineers (experts) for
technical opinions about the undertaking. The best solution
depends on information such as the fitness of the current
production processes for the new product
features, design requirements, characteristics of
the materials needed, time constraints, and so forth.
In short, it requires assessment and technical opinions
from many of the firm’s experts.

Depending on the product’s complexity, priority,
constraints, and so forth, experts start exchanging
opinions with those whom they know to be
competent in the subject. As news
about the new product spreads, a potential collaborative
community can emerge and bring the firm’s
technical experience to the surface. This occurs as the
experts evaluate how well the firm’s production
processes fit the new product requirements, how
many similar products it has already designed, which
of the product’s parts could be assembled under a
partnership scheme, and so on. Such opinions strongly
support the decision-making process related to the
feasibility of the new product.

Every community can develop both individual 
and collective competence by creating, expanding, 
and exchanging knowledge. Mutual collaboration 
among members of a community, as well as with 
other groups, pre-supposes the ability to exchange 
information and to make the knowledge available to 
others. Therefore, collaborative communities should 
be pervasive, integrated, and supported by the enterprise
rules that limit and/or direct people’s actions in
organizations.

Recommender systems (Table 1) search and 
retrieve information according to users’ needs; they 
can be specialized on users’ profiles or on the users’ 
instantaneous interests (e.g., when users browse the 
Web). “Recommender systems use the opinions of 
members of a community to help individuals in that 
community identify the information or products most 
likely to be interesting to them or relevant to their 
needs” (Konstan, 2004, p. 1). By discovering people’s
interests and comparing whether such interests are
the same or similar, recommender systems help
individuals form collaborative communities.

Usually, this kind of system is based on artificial 
intelligence technologies such as machine learning 
algorithms, ontologies, and multi-agent systems. Such 
technologies can be used separately or combined in 
different ways to find and deliver information to the 
people who require it. Ontologies are well-suited for
knowledge sharing as they offer a formal base for 
describing terminology in a knowledge domain 
(Gruber, 1995; McGuinness, 2001). 

Recommender systems can be classified as col- 
laborative filtering with and without content analysis 
and knowledge sharing. Collaborative filtering with 
content analysis is based on information from trusted 
people, which the system recognizes and also rec- 
ommends. Examples of this kind of system are: 
GroupLens (Konstan et al., 1997), ReferralWeb 
(Kautz et al., 1997) and Yenta (Foner, 1999). Col- 
laborative filtering without content analysis exam- 
ines the meta-information and classifies it according 
to the user’s current context. It can recommend 
multimedia information that otherwise could be too 
complex to be analyzed. PHOAKS (Terveen et al.,
1997) is an example of a system that associates
scientific profiles by chaining bibliographic references
with no comprehension of the article’s contents.

Table 1. Recommender systems comparison

Recommender system                    Collaborative filtering       Knowledge  Multi-  Uses      Recommends
                                      With content  Without content sharing    agent   ontology
                                      analysis      analysis

Yenta                                 X                                                X         Scientific papers
  (Foner, 1999)
GroupLens                             X                                                          Usenet news
  (Konstan, Miller, Herlocker, Gordon, & Riedl, 1997)
ReferralWeb                           X                                                X         Scientists
  (Kautz, Selman, & Shah, 1997)
PHOAKS                                              X                          X                 Scientists
  (Terveen, Hill, Amento, McDonald, & Creter, 1997)
OntoShare                                                           X          X                 Opinions
  (Davies, Dukes, & Stonkus, 2002)
Quickstep                                                           X          X       X         Web pages
  (Middleton, De Roure, & Shadbolt, 2001)
OntoCoPI                                                            X          X                 Communities of practice
  (Alani, O’Hara, & Shadbolt, 2002)
SHEIK                                                               X          X       X         Similar profiles
  (Nabuco, Rosario, Silva, & Drira, 2004)



BUILDING COLLABORATIVE 
COMMUNITIES 

This work focuses on the building (or establishment)
phase of a collaborative community. It is assumed
that the initial set of candidates to participate in such
a community is already motivated to do so; the
sociological or psychological elements involved in
their decision are outside the scope of this work.

The question faced is how to fulfill a demand for
dealing with a particular event that arises and drives
the candidates to search for partners or peers. A
potential collaborative community could emerge, triggered
by that event and composed of a subset
of the initial set of candidates. Within such a
subset, the candidates must recognize one another as
peers, that is, as persons sharing the particular interests that
characterize the community.

The selected approach was to develop a
recommender system that can compare candidates’
profiles and recommend those candidates who share
similar interests as potential members of a
community. From each candidate, the system must
gather a profile, characterized as a combination of
static information on location (Name, Project, Organization,
Address) and dynamic information on skills,
described as a set of keywords. These profiles are
then processed with two objectives: to allow the
automatic organization of the informal vocabulary
used by the candidates into formal knowledge domain
categories, and to establish a correlation between
candidates by selecting those with overlapping areas
of interest (formal knowledge domains), which enables the
recommendation process.

If a subset of candidates shares
one or more knowledge domains, that subset is a
potential community, and those candidates are informed
of the existence of such a community, receiving
information on the other candidates’ locations and
on the overlapping knowledge domains. Candidates
can then contact their potential
peers at will, using their own criteria of personal
networking.
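
The two processing objectives (mapping informal keywords to formal knowledge domains, then selecting candidates whose domains overlap) can be sketched as follows. The keyword-to-domain table and the candidate names are invented for illustration; in the approach described here that mapping would come from an ontology rather than a hand-written dictionary.

```python
# Hypothetical mapping from informal keywords to formal knowledge domains.
KEYWORD_TO_DOMAIN = {
    "milling": "machining",
    "lathe": "machining",
    "plc": "automation",
    "robot": "automation",
    "cad": "design",
}


def domains_of(keywords):
    """Map a candidate's informal keywords to formal knowledge domains."""
    return {KEYWORD_TO_DOMAIN[k] for k in keywords if k in KEYWORD_TO_DOMAIN}


def potential_communities(profiles):
    """Return {domain: sorted candidate names} for domains shared by 2+ candidates."""
    members = {}
    for name, keywords in profiles.items():
        for domain in domains_of(keywords):
            members.setdefault(domain, set()).add(name)
    return {d: sorted(names) for d, names in members.items() if len(names) >= 2}


profiles = {
    "ana": ["milling", "cad"],
    "bruno": ["lathe", "robot"],
    "carla": ["plc"],
}
communities = potential_communities(profiles)
```

A domain with only one candidate (here, "design") produces no community, since there is nobody to recommend.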



One important remark on this process of community
building is that candidates are not forced to
participate, and, if they want to, the
sole requirement is to share their profile with other
candidates. Privacy is a very important issue in the
process of collaboration. Candidates are assured
that their location information will be forwarded
only to future potential communities, and that
their complete profile will not be publicly available.

An experimental recommender system, named
SHEIK (Sharing Engineering Information and
Knowledge), was designed to act as a community-building
system. SHEIK is based on agent and
ontology technologies and is implemented in
Java. It advises candidates on the potential
collaborative community they could belong to.
Ontologies are used to relate knowledge domains
to a candidate’s profile, while agents capture each
candidate’s interests and provide him/her with useful
information.



SHEIK ARCHITECTURE 

SHEIK was designed in two layers: an interface
layer responsible for gathering candidates’ profiles
and a processing layer that consolidates the data
on each candidate and enables the process of
peer finding.

The interface layer is composed of a set of
agents, one for each candidate, each agent residing
on the associated candidate’s computer. The agents
use knowledge acquisition techniques to search the
candidate’s workspace (in a way configured by the
candidate) and communicate results to the processing
layer on a regular basis (usually daily). The
interface agent thus automatically updates the candidate’s
general information in the system, relieving the
candidate of repetitive tasks he/she is unfamiliar
with. Also, as there is one interface agent for each
candidate, all of the interface agents operate autonomously
and in parallel, increasing system power.

The processing layer is composed of agents,
named Erudite agents, forming a federation. Each
Erudite agent is associated with a particular knowledge
domain and has the ability to process
candidates’ profiles using a knowledge base
constructed over a model that associates one ontology
(related to its knowledge domain) and a data model
for the candidates’ profiles.

SHEIK BEHAVIOR 

For the sake of clarity, let us consider an example of
forming an engineering team to develop a new
product in a consortium of enterprises. This situation
is becoming frequent and illustrates
some interesting issues. First of all, the candidates to
compose the collaborative community probably do
not know their potential peers. Even if they
have an initial idea of the set of candidates, based
on their social interaction inside their own enterprise,
they normally do not have the same knowledge outside
it, as they cannot know other
external engineering teams in advance.

Secondly, the use of existing filtering systems, 
based on Internet technologies, is usually not enough 
to find candidates for partnership as, because of 
privacy, enterprises protect their data about both 
engineering processes and personnel qualifications. 
Using trial and error procedures for searching the 
Web is difficult because the available information is 
unstructured and comes in huge quantities. 

From a candidate’s perspective, the functioning
of SHEIK can be described as follows:

When the need to create a new engineering
team for developing a product arises, managers
start a SHEIK system, activating an Erudite agent
containing knowledge in the main area required for
the development process. Each candidate, inside
each enterprise, is informed of SHEIK’s existence
and advised to register in the system. Candidates
register; during this brief process, they
agree to have an interface agent residing on their
machines, to provide location data, and to
determine where the agent is allowed to search for
information (usually their computer’s workspace).
The interface agent is started and automatically
analyzes the candidate’s workspace, extracting a
series of keywords that approximately describe
the candidate’s interests. The agent repeats this
operation periodically (for instance, daily), so it can
capture changes in the candidate’s interests, and
forwards the information to the Erudite agent. Many
knowledge acquisition techniques can be used; for
the first prototype, the KEA tool (Witten, Paynter,
Frank, Gutwin, & Nevill-Manning, 1999) from the
University of Waikato, New Zealand, was selected.

There is one interface agent for each candidate; 
they work automatically and asynchronously gather- 
ing information on their candidates and pushing it to 
the Erudite agent. 
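
The periodic cycle of an interface agent (analyze the workspace, extract keywords, push them onward) can be sketched as follows. This is not the KEA algorithm and not SHEIK's code: keyword extraction is replaced here by a plain term-frequency count, and the class `InterfaceAgent`, its callbacks, and the stopword list are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on", "with"}


def extract_keywords(documents, top_n=3):
    """Return the top_n most frequent non-stopword terms across documents."""
    counts = Counter()
    for text in documents:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS and len(word) > 2:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]


class InterfaceAgent:
    """Hypothetical interface agent serving a single candidate."""

    def __init__(self, candidate, read_workspace, send):
        self.candidate = candidate
        self.read_workspace = read_workspace  # callable returning a list of texts
        self.send = send                      # callable(candidate, keywords)

    def run_once(self):
        """One periodic cycle: analyze the workspace and push the result."""
        keywords = extract_keywords(self.read_workspace())
        self.send(self.candidate, keywords)
        return keywords
```

Passing `read_workspace` and `send` as callables keeps the agent independent of where documents live and how results are delivered, so a scheduler could simply call `run_once` daily.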

The Erudite agent, upon receiving the information
from a particular interface agent, processes that
information and classifies the candidate (associated
with the agent) into one or more areas of its main
knowledge domain. As already mentioned, many
interface agents send information in parallel
to the Erudite agent, which continuously updates its
knowledge base. After classifying the candidate, the
Erudite agent can search its
knowledge base and discover, for each candidate,
which other candidates have related
interests. The result of such a search is a potential
collaborative community in the retrieved sub-area,
organized as a list of candidates and their compatible
knowledge sub-domains. The candidates belonging
to that potential community (list) are informed of
their potential peers via an e-mail message. Candidates
can analyze the message and decide whether they
want to contact the recommended candidates, and,
after some interaction, they can decide to formalize
the new community. The Erudite agent uses the
Protégé system (Gennari et al., 2003), developed at
Stanford Medical Informatics (SMI), Stanford
University, USA. Protégé is an ontology editor and a
knowledge-base editor capable of answering queries.
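
The behavior of an Erudite agent (classify each incoming profile, keep the knowledge base current, and answer peer queries) can be approximated with the sketch below. It does not use Protégé or any real ontology API; the class `EruditeAgent` and the sub-domain term sets are hypothetical stand-ins for an ontology-backed knowledge base.

```python
class EruditeAgent:
    """Toy stand-in for an Erudite agent covering one knowledge domain."""

    def __init__(self, subdomain_terms):
        # Hypothetical ontology fragment: sub-domain -> characteristic terms.
        self.subdomain_terms = {s: set(t) for s, t in subdomain_terms.items()}
        self.knowledge_base = {}  # candidate -> set of sub-domains

    def receive(self, candidate, keywords):
        """Classify the candidate; a newer report replaces the older one."""
        words = set(keywords)
        self.knowledge_base[candidate] = {
            s for s, terms in self.subdomain_terms.items() if terms & words
        }

    def peers_of(self, candidate):
        """Return {peer: shared sub-domains} for candidates with overlap."""
        mine = self.knowledge_base.get(candidate, set())
        peers = {}
        for other, theirs in self.knowledge_base.items():
            shared = mine & theirs
            if other != candidate and shared:
                peers[other] = sorted(shared)
        return peers
```

Because each report overwrites the previous classification, a candidate whose interests drift simply drops out of one potential community and may appear in another on the next query.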

An experiment demonstrated the potential of
SHEIK as a tool in a scientific environment, and
its use and application are expected to broaden to other
domains. SHEIK is being tested to support
collaboration among scientific researchers in the
field of manufacturing, involving academic institutions
from Brazil (Nabuco et al., 2004).

FUTURE TRENDS 

Since the advent of Internet browsers, the amount of
electronic information available has been growing at
an astonishing rate, but such information sometimes
cannot be put to use in its bare form
because of its lack of structure. Search
engines basically use syntactic rules to obtain their results;
they do not consider the information within its context.
Efforts are under way to enhance the information
through some sort of pre-classification into
knowledge domains, done via “clues” hidden in the
code of Internet pages. One problem that arises is
that there are too many knowledge domains, and
internationally standardized knowledge classifications are
almost nonexistent. This challenge is too big to be
solved by a single standardization body. It must be
addressed by a diversity of organizations, which brings to light
another problem: how to describe knowledge in a way
that permits us to interchange and complement its
meaning among such diversity. Research has been
done in this area, but to date no standard for knowledge
representation has found widespread acceptance.

The problem dealt with in this work is building a 
collaborative community — such a community being 
characterized by having interests pertaining to cer- 
tain knowledge domains. The problem of formally 
characterizing knowledge domains, aiming to ease 
the interaction and interoperation between knowl- 
edge systems, is an open issue, and such a problem 
is considered as a future path to be followed, having 
the use of best practices and de-facto standards as 
guidelines. 

In particular as the Erudite agents are organized as 
a federation, each agent being responsible for one 
knowledge domain, it is of interest to study the problem 
of knowledge domain determination, a difficult task 
because many recognized knowledge domains overlap 
with each other (are not mutually exclusive).

The interface agents were designed to accept
different knowledge acquisition techniques with
minor code modification; experimenting with some of
those techniques could increase SHEIK’s capacity.

CONCLUSION 

An approach for aiding the establishment of a collaborative
community was proposed, based on the
automatic determination of a candidate’s profile that
can be resolved to knowledge domains (and sub-domains).
The use of ontologies was advocated as a
way of formally characterizing the knowledge domains,
easing, to some extent, the job of knowledge
treatment and exchange. As a prerequisite, system
friendliness was enforced, meaning that the system
supporting such an approach must act automatically,
aiding the users (candidates to belong to a new
community) in the process of finding peers to fulfill
their purposes of collaboration. As a natural outcome
of this requirement, the prototype system was
designed as a multi-agent system, scalable
and capable of autonomous operation.

A prototype system was constructed using a pre-existing
knowledge system and knowledge acquisition
tool, compatible with de facto standards. In
operation, the system is capable of dynamically
following users’ interests as they change with time,
so it can also aid the self-emergence of new communities
regardless of managers’ requests.

Initial results were encouraging, and a larger
experiment is under way in the manufacturing knowledge
domain.

KEY TERMS

Agent: A software agent is a programmable 
artifact capable of intelligent autonomous action 
toward an objective. 

Collaborative Community: A group of people 
sharing common interests and acting together to- 
ward common goals. 

Erudite Agent: Acts as a broker, locating compatible
candidates according to specific similarities
found in their profiles. Erudite has two roles: first,
at the request of interface agents, it queries its
knowledge base, searching for other candidates that
seem to have the same interests; second, it keeps
its knowledge base updated with the candidate-specific
information provided by the resident interface
agents. In doing so, it implements the processing
part of the SHEIK system’s architecture.

Filtering: A technique that selects specific things 
according to criteria of similarity with particular 
patterns. 

Knowledge Base: A knowledge repository,
organized according to formal descriptive rules,
that permits operations to be performed over the
represented knowledge.

Machine Learning: A computer embedded 
capability of data analysis with the purpose of ac- 
quiring selected characteristics (attributes, patterns, 
behavior) of an object or system. 

Multi-Agent System: A set of software agents 
that interact with each other. Inside this system, 
each agent can act individually in cooperative or 
competitive forms. 

Ontology: In the computer science community, 
this term is employed in the sense of making explicit 
the concepts pertaining to a knowledge domain. 

Recommender System: A computer program
that helps people find information by giving recommendations
on the searched subject. It is up to the user
to select useful data among the recommended items.

Tacit Knowledge: Knowledge that is
embodied in people’s minds, not easily visible, and
difficult to put into words, even for skilled people.
The concept was introduced by Polanyi (1966).

INTRODUCTION

Traditionally, learning material is delivered in a 
textual format and on paper. For example, a learning 
module on a topic may include a description (or a 
tutorial) of the topic, a few examples illustrating the 
topic, and one or more exercise problems to gauge 
how well the students have achieved the expected 
understanding of the topic. The delivery mechanism 
of the learning material has traditionally been via 
textbooks and/or instructions provided by a teacher. 
A teacher, for example, may provide a few pages of 
notes about a topic, explain the topic for a few 
minutes, discuss a couple of examples, and then give 
some exercise problems as homework. During the 
delivery, students ask questions and the teacher 
attempts to answer the questions accordingly. Thus, 
the delivery is interactive: the teacher learns how 
well the students have mastered the topic, and the 
students clarify their understanding of the topic. In a 
traditional classroom of a relatively small size, this 
scenario is feasible. However, when e-learning approaches
are involved, or in the case of a large class
size, the traditional delivery mechanism is often not
feasible.

In this article, we describe an interface that is 
“active” (instead of passive) that delivers learning 
material based on the usage history of the learning 
material (such as degree of difficulty, the average 
score, and the number of times viewed), the student’ s 
static background profile (such as GPA, majors, 
interests, and courses taken), and the student’s 
dynamic activity profile (based on their interactions 
with the agent). This interface is supported by an 
intelligent agent ( W ooldridge & Jennings , 1995). An 
agent in this article refers to a software module that 
is able to sense its environment, receive stimuli from 
the environment, make autonomous decisions, and 
actuate the decisions, which in turn change the 
environment. An intelligent agent in this article 
refers to an agent that is capable of flexible behaviour: 



responding to events timely, exhibiting goal-directed 
behaviour, and performing machine learning. The 
agent uses the profiles to decide, through case- 
based reasoning (CBR) (Kolodner, 1993), which 
learning modules (examples and problems) to present 
to the students. Our CBR treats the input situation as 
aproblem, and the solution is basically the specifica- 
tion of an appropriate example or problem. Our 
agent also uses the usage history of each learning 
material to adjust the appropriateness of the ex- 
amples and problems in a particular situation. We 
call our agent Intelligent Learning Material Delivery 
Agent (ILMDA). We have built an end-to-end 
ILMDA infrastructure, with an active GUI front-end
that monitors and tracks every interaction step
of the user with the interface, an agent powered by
CBR and capable of learning, and a multi-database 
backend. 

In the following, we first discuss some related 
work in the area of intelligent tutoring systems. 
Then, we present our ILMDA project, its goals and 
framework. Subsequently, we describe the CBR 
methodology and design. Finally, we point out some 
future trends before concluding. 

BACKGROUND 

Research strongly supports the use of technology
as a catalyst for improving the learning environment 
(Sivin-Kachala & Bialo, 1998). Educational technol- 
ogy has been shown to stimulate more interactive 
teaching, effective grouping of students, and coop- 
erative learning. A few studies, which estimated the 
cost effectiveness, reported time saving of about 
30%. At first, professors can be expected to struggle 
with the change brought about by technology. How- 
ever, they will adopt, adapt, and eventually learn to 
use technology effortlessly and creatively (Kadiyala 
& Crynes, 1998). As summarized in Graesser, 
VanLehn, Rose, Jordan, and Harter (2001), intelligent
tutoring systems (ITSs) are clearly one of the
successful enterprises in artificial intelligence (AI). 
There is a long list of ITSs that have been tested on 
humans and have proven to facilitate learning. These 
ITSs use a variety of computational modules that are 
familiar to those of us in AI: production systems, 
Bayesian networks, schema templates, theorem prov- 
ing, and explanatory reasoning. Graesser et al. (2001) 
also pointed out the weaknesses of the current state 
of tutoring systems: First, it is possible for students 
to guess and find an answer and such shallow 
learning will not be detected by the system. Second, 
ITSs do not involve students in conversations so 
students might not learn the domain’s language. 
Third, to understand the students’ thinking, the GUI 
of the ITSs tends to encourage students to focus on 
the details instead of the overall picture of a solution. 

There have been successful ITSs such as PACT 
(Koedinger, Anderson, Hadley, & Mark, 1997), 
ANDES (Gertner & VanLehn, 2000), AutoTutor 
(Graesser et al., 2001), and SAM (Cassell et al., 
2000), but without machine learning capabilities. 
These systems do not generally adapt to new cir- 
cumstances, do not self-evaluate and self-configure 
their own strategies, and do not monitor the usage 
history of the learning material being delivered or 
presented to the students. In our research, we aim to 
build intelligent tutoring agents that are able to learn 
how to deliver appropriate learning material
to different types of students and to monitor and
evaluate how the learning material is received by
the students. To model students, our agent has to 
monitor and track student activity through its inter- 
face. 



APPLICATION FRAMEWORK 

In the ILMDA project, we aim to design an agent- 
supported interface for online tutoring. Each topic to 
be delivered to the students consists of three compo- 
nents: (1) a tutorial, (2) a set of related examples, and 
(3) a set of exercise problems to assess the student’s 
understanding of the topic. Based on how a student 
progresses through the topic and based on his or her 
background profile, our agent chooses the appropri- 
ate examples and exercise problems for the student. 
In this manner, our agent customizes the specific 
learning material to be provided to the student. Our
design modularizes the course content
and delivery mechanism, utilizes true agent intelli- 
gence where an agent is able to learn how to deliver 
its learning material better, and self-evaluates its 
own learning material. 

The underlying assumptions behind the design of 
our agent are the following. First, a student’s 
behaviour in viewing an online tutorial, and how he or 
she interacts with the tutorial, the examples, and the 
exercises, is a good indicator of how well the student 
understands the topic in question, and this behaviour 
is observable and quantifiable. Second, different 
students exhibit different behaviours for different 
topics such that it is possible to show a student’s 
understanding of a topic, say, T1, with an example
E1, and at the same time, to show the same student’s
lack of understanding of the same topic T1 with
another example E2, and this differentiation is known and can
be implemented. These two assumptions require our 
agent to have an active interface, that is, an interface
that monitors and tracks its interaction with the user.

Further, we want to develop an integrated, flex- 
ible, easy-to-use database of courseware and 
ILMDA system, including operational items such as 
student profiles, ILMDA success rates, and so forth, 
and educational items such as learner model, domain 
expertise, and course content. This will allow teach- 
ers and educators to monitor and track student 
progress, the quality of the learning material, and the 
appropriateness of the material for different student 
groups. With the ability to self-monitor and evaluate, 
our agent can identify how best to deliver a topic to 
a particular student type with distinctive behaviours. 
We see this as valuable knowledge to instructional 
designers and educational researchers, as ILMDA
is not only a testbed for testing hypotheses but
also an active decision maker that can expose knowledge
or patterns that were previously unknown to
researchers.



MODEL 

Our ILMDA system is based on a three-tier model,
as shown in Figure 1. It consists of a graphical user
interface (GUI) front-end application, a database
backend, and the ILMDA reasoning in between. A
student user accesses the learning material through
the GUI. The agent captures the student’s interactions
with the GUI and provides the ILMDA reasoning
module with a parametric profile of the student
and environment. The ILMDA reasoning module
performs case-based reasoning to obtain a search
query (a set of search keys) to retrieve and adapt the
most appropriate example or problem from the database.
The agent then delivers the example or problem
in real-time back to the user through the interface.

Figure 1. Overall methodology of the ILMDA system

Overall Flow of Operations 

When a student starts the ILMDA application, he or 
she is first asked to log in. This associates the user
with his or her profile information. The information is 
stored in two separate tables. All of the generally 
static information, such as name, major, interests, 
and so forth, is stored in one table, while the user’s 
dynamic information (i.e., how much time, on aver- 
age, they spend in each section; how many times they 
click the mouse in each section, etc.) is stored in 
another table. After a student is logged in, he or she 
selects a topic and then views the tutorial on that 
topic. Following the tutorial, the agent looks at the 
student’s static profile, as well as the dynamic ac- 
tions the student took in the tutorial, and searches the 
database for a similar case. The agent then adapts 
the output of that similar case depending on how the 
cases differ, and uses the adapted output to search 
for a suitable example to give to the student. After the 
student is done looking at the examples, the same 
process is used to select an appropriate problem. 
Again, the agent takes into account how the student 
behaved during the example, as well as his or her
background profile. After the students complete an
example or a problem, they may elect to be given 
another. If they do so, the agent notes that the last 
example or problem it gave the student was not a 
good choice for that student, and tries a different 
solution. Figure 2 shows the interaction steps be- 
tween our ILMDA agent and a student. 



Figure 2. GUI and interactions between ILMDA
and students

Learner Model

A learner model is one that tells us the metacognitive 
level of a student by looking at the student’s behaviour
as he or she interacts with the learning material. We 
achieve this by profiling a learner/student along two 
dimensions: student’s background and activity. The 
background of a student stays relatively static and 
consists of the student’s last name, first name, 
major, GPA, goals, affiliations, aptitudes, and competencies.
The student’s dynamic profile captures
the student’s real-time behaviour and patterns. It
consists of the student’s online interactions with the
GUI module of the ILMDA agent including the 
number of attempts on the same learning material, 
number of different modules taken so far, average 
number of mouse clicks during the tutorial, average 
number of mouse clicks viewing the examples, aver- 
age length of time spent during the tutorial, number 
of quits after tutorial, number of successes, and so 
on. 
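
The two profile dimensions described above can be sketched as a pair of record types. This is a minimal illustration only: the class names, field names, and the choice of which counters to keep are assumptions made here, not the actual ILMDA schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StaticProfile:
    """Background information that stays relatively static (hypothetical fields)."""
    last_name: str
    first_name: str
    major: str
    gpa: float
    interests: List[str] = field(default_factory=list)


@dataclass
class DynamicProfile:
    """Running totals updated from the student's interactions with the GUI."""
    tutorials_viewed: int = 0
    tutorial_clicks: int = 0
    tutorial_seconds: float = 0.0
    quits_after_tutorial: int = 0
    successes: int = 0

    def record_tutorial(self, clicks: int, seconds: float) -> None:
        """Fold one tutorial session into the running totals."""
        self.tutorials_viewed += 1
        self.tutorial_clicks += clicks
        self.tutorial_seconds += seconds

    def average(self, total: float) -> float:
        """Average a running total over the tutorials viewed so far."""
        return total / self.tutorials_viewed if self.tutorials_viewed else 0.0


@dataclass
class LearnerModel:
    static: StaticProfile
    dynamic: DynamicProfile = field(default_factory=DynamicProfile)

    def as_features(self) -> Dict[str, float]:
        """Flatten both profiles into a parametric form a CBR module could match on."""
        d = self.dynamic
        return {
            "gpa": self.static.gpa,
            "avg_tutorial_clicks": d.average(d.tutorial_clicks),
            "avg_tutorial_seconds": d.average(d.tutorial_seconds),
            "quits_after_tutorial": float(d.quits_after_tutorial),
            "successes": float(d.successes),
        }
```

Keeping running totals rather than stored averages lets each new session be folded in without revisiting earlier sessions.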

In our research, we incorporate the learner model 
into the case-based reasoning (CBR) module as part 
of the problem description of a case: Given the 
parametric behaviour, the CBR module will pick the 
best matching case and retrieve the set of solution 
parameters that will determine which examples or 
exercise problems to pick. 

Case-Based Reasoning 

Each agent has a case-based reasoning (CBR) 
module. In our framework, the agent consults the 
CBR module to obtain the specifications of the 
appropriate types of examples or problems to admin- 
ister to the users. The learner model discussed 
earlier and the profile of the learning material consti- 
tute the problem description of a case. The solution 
is simply a set of search keys (such as the number of 
times the material has been viewed, the difficulty 
level, the length of the course content in terms of the 
number of characters, the average number of clicks 
the interface has recorded for this material, etc.) 
guiding the agent in its retrieval of either an example
or an exercise problem.

Note that CBR is a reasoning process that derives
a solution to the current problem by
adapting a known solution to a previously encountered
problem that is similar to the current one. Applied to
our application, the information associated with a
student and a particular topic is the situation state- 
ment of a case, and the solution of the case is 
basically the characteristics of the appropriate ex- 
amples or problems to be delivered to the student. 
When adapting, CBR changes the characteristics of 
the appropriate examples or problems based on the 
difference between the current situation and the 
situation found in the most similar case. CBR allows 
the agent to derive a decent solution from what the 
agent has stored in its casebase instead of coming up 
with a solution from scratch. 
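
The retrieve-and-adapt cycle described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the ILMDA implementation: features are assumed to be normalised to [0, 1], the weighted-difference similarity is one common choice among many, and the `rules` table of adaptation coefficients is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

Features = Dict[str, float]


@dataclass
class Case:
    problem: Features   # situation: learner and material descriptors in [0, 1]
    solution: Features  # search keys, e.g. the target difficulty of an example


def similarity(a: Features, b: Features, weights: Features) -> float:
    """Weighted similarity in [0, 1], assuming every feature lies in [0, 1]."""
    total = sum(weights.values())
    score = sum(w * (1.0 - abs(a[k] - b[k])) for k, w in weights.items())
    return score / total


def retrieve(casebase: List[Case], query: Features, weights: Features) -> Case:
    """Return the stored case whose situation is most similar to the query."""
    return max(casebase, key=lambda c: similarity(c.problem, query, weights))


def adapt(case: Case, query: Features,
          rules: Dict[str, Dict[str, float]]) -> Features:
    """Shift each search key in proportion to the situation differences.

    rules[key][feature] is a hypothetical adaptation coefficient.
    """
    adapted = dict(case.solution)
    for key, coeffs in rules.items():
        delta = sum(c * (query[f] - case.problem[f]) for f, c in coeffs.items())
        adapted[key] = min(1.0, max(0.0, adapted[key] + delta))
    return adapted


# Hypothetical casebase with two stored situations.
casebase = [
    Case({"gpa": 0.9, "tutorial_time": 0.2}, {"difficulty": 0.8}),
    Case({"gpa": 0.5, "tutorial_time": 0.7}, {"difficulty": 0.3}),
]
weights = {"gpa": 2.0, "tutorial_time": 1.0}
rules = {"difficulty": {"gpa": 0.5}}  # a higher GPA nudges difficulty upward
query = {"gpa": 0.7, "tutorial_time": 0.6}

best = retrieve(casebase, query, weights)
search_keys = adapt(best, query, rules)
```

Here the second case is the closer match, and its difficulty key is raised slightly because the current student has a higher GPA than the student in the stored case. The adapted keys would then drive the database search for a suitable example or problem.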

Case-Based Learning (CBL) 

To improve its reasoning process, our agent learns 
the differences between good cases (cases with a 
good solution for its problem space) and bad cases 
(cases with a bad solution for its problem space). It 
also meta-learns adaptation heuristics, the signifi- 
cance of input features of the cases, and the weights 
of a content graph for symbolic feature values. 

Our agent uses various weights when selecting a 
similar case, a similar example, or a similar problem. 
By adjusting these weights, we can improve our 
results, and hence, learn from our experiences. In 
order to improve our agent’s independence, we want
to have the agent adjust the weights without human
intervention. To do this, the agent uses simple methods,
called learning modules, to adjust the weights.
Adjusting the weights in this manner gives us a 
layered learning system (Stone, 2000) because the 
changes that one module makes propagate through 
to the other modules. For instance, the changes we 
make to the similarity heuristics will alter which 
cases the other modules perceive as similar. 
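
One way such a learning module could adjust similarity weights without human intervention is sketched below. The update rule is an assumption chosen for illustration (the article does not specify one): after a failed delivery, features on which the query and the retrieved case disagreed gain weight, and after a success, features on which they agreed gain weight.

```python
from typing import Dict

Features = Dict[str, float]


def update_weights(weights: Features, query: Features, case_problem: Features,
                   success: bool, rate: float = 0.1) -> Features:
    """Return renormalised weights after one observed outcome.

    Hypothetical rule: features are assumed to lie in [0, 1], and the
    returned weights always sum to 1.
    """
    updated = {}
    for key, weight in weights.items():
        agreement = 1.0 - abs(query[key] - case_problem[key])
        signal = agreement if success else 1.0 - agreement
        updated[key] = weight * (1.0 + rate * signal)
    total = sum(updated.values())
    return {key: weight / total for key, weight in updated.items()}


weights = {"gpa": 0.5, "tutorial_time": 0.5}
query = {"gpa": 0.7, "tutorial_time": 0.6}
stored = {"gpa": 0.7, "tutorial_time": 0.1}

# The student asked for another example, so the delivery counts as a failure.
after_failure = update_weights(weights, query, stored, success=False)
after_success = update_weights(weights, query, stored, success=True)
```

Because the new weights change which cases look similar on the next retrieval, an update made by this module affects every module that relies on the similarity measure, which is the layering effect described above.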

Active Graphical User Interface 

The ILMDA front-end GUI application is written in 
Java, using the Sun Java Swing library. It is active as 
it monitors and tracks every interaction step be- 
tween a student and ILMDA. It stores these activi- 
ties in its database, based on which the ILMDA 
agent maintains a dynamic profile of the student and 
reasons to provide the appropriate learning material. 

For our GUI, a student progresses through the 
tutorials, examples, and problems in the following 
manner. First, the student logs onto the system. If he
or she is a new student, then a new profile is created.
The student then selects a topic to study. The agent 
administers the tutorial associated with the topic to 
the student accordingly. The student studies the 
tutorial, occasionally clicking and scrolling, and 
browsing the embedded hyperlinks. Then when the 
student is ready to view an example, he or she clicks 
to proceed. Sensing this click, the agent immediately 
captures all the mouse activity and time spent in the 
tutorial and updates the student’s activity profile. 
The student goes through a similar process with the example and may
choose to quit the system, indicating a failure of our
example, or go back to the tutorial page for
further clarification. If the student decides that he or 
she is ready for the problem, the agent will retrieve 
the most appropriate one based on the student’s 
updated dynamic profile. 
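
The tracking behaviour of an active interface can be sketched independently of any GUI toolkit. The ILMDA front-end itself is written in Java Swing; the Python class below is only a language-neutral illustration, and the `InteractionTracker` name and its methods are assumptions made here.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Optional


class InteractionTracker:
    """Accumulates clicks and time spent per section (tutorial, example, problem)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock                   # injectable so tests can fake time
        self._section: Optional[str] = None
        self._entered = 0.0
        self.clicks: Dict[str, int] = defaultdict(int)
        self.seconds: Dict[str, float] = defaultdict(float)

    def enter(self, section: str) -> None:
        """Call when the student proceeds to a new section."""
        self._close_current()
        self._section = section
        self._entered = self._clock()

    def click(self) -> None:
        """Call from the GUI's mouse-click handler."""
        if self._section is not None:
            self.clicks[self._section] += 1

    def leave(self) -> None:
        """Call when the student quits or finishes."""
        self._close_current()
        self._section = None

    def _close_current(self) -> None:
        # Add the elapsed time to the section that is being left, if any.
        if self._section is not None:
            self.seconds[self._section] += self._clock() - self._entered
```

When the student clicks to proceed, the GUI would call `enter` for the next section, and the totals collected for the section just left could be written to the dynamic profile.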

FUTURE TRENDS 

We expect the field of intelligent tutoring systems to 
make great contributions in the near future as
intelligent agent technologies are incorporated to 
customize learning material for different students, 
re-configure their own reasoning process, and evalu- 
ate the quality of the learning material that they 
deliver to the students. Interfaces that are more 
flexible in visualizing and presenting learning mate- 
rial will also be available. For example, a tutorial 
would be presented in multiple ways to suit different 
students (based on students’ background and the 
topic of the tutorial). Therefore, an agent-supported 
interface that is capable of learning would maximize 
the impact of such tutorials on the targeted student
groups. Interfaces with intelligent agents will be
able to adapt to students in a timely and proactive manner,
which makes this type of online tutoring highly
personalized.

CONCLUSION 

We have described an intelligent agent that delivers 
learning material adaptively to different students. 
We have built the ILMDA infrastructure, with a 
GUI front-end, an agent powered by case-based 
reasoning (CBR), and a multi-database backend. 



We have also built a comprehensive simulator for 
our experiments. Preliminary experiments demon- 
strate the correctness of the end-to-end behaviour 
of the ILMDA agent, and show the feasibility of 
ILMDA and its learning capability. Ongoing and 
future work includes incorporating complex learner 
and instructional models into the agent and conduct- 
ing further experiments on each learning mecha- 
nism, and investigating how ILMDA adapts to a 
student’s behaviour, and how ILMDA adapts to 
different types of learning material.

KEY TERMS

Active Interface: An interface that monitors 
and tracks its interaction with the user. 

Agent: A module that is able to sense its envi- 
ronment, receive stimuli from the environment, make 
autonomous decisions, and actuate the decisions, 
which in turn change the environment. 

Casebase: A collection of cases with each case 
containing a problem description and its correspond- 
ing solution approach. 



Case-Based Learning (CBL): Stemming from 
case-based reasoning, the process of determining 
and storing cases of new problem-solution scenarios 
in a casebase. 

Case-Based Reasoning (CBR): A reasoning 
process that derives a solution to the current problem 
by adapting a known solution to a previously
encountered problem that is similar to the current one.

Intelligent Agent: An agent that is capable of 
flexible behaviour: responding to events in a timely manner,
exhibiting goal-directed behaviour and social behaviour,
and conducting machine learning to improve its own 
performance over time. 

Intelligent Tutoring System (ITS): A soft- 
ware system that is capable of interacting with a 
student, providing guidance in the student’s learning
of a subject matter.

INTRODUCTION

The number of scientific journals in print
each year is estimated at approximately 70,000-80,000
(Rowland, McKnight, & Meadows, 1995). Each year, the Institute
for Scientific Information (ISI) adds over
1.3 million new articles and more than 30 million new
citations to its science citation databases of 8,500
research journals. The widely available electronic
repositories of scientific publications, such as digital 
libraries, preprint archives, and Web-based citation 
indexing services, have considerably improved the 
way articles are being accessed. However, it has 
become increasingly difficult to see the big picture of 
science. 

Scientific frontiers and longer-term developments 
of scientific disciplines have been traditionally stud- 
ied from sociological and philosophical perspectives 
(Kuhn, 1962; Stewart, 1990). The scientometrics 
community has developed quantitative approaches 
to the study of science. In this article, we introduce 
the history and the state of the art associated with 
the ambitious quest for detecting and tracking the 
advances of scientific frontiers through quantitative 
and computational approaches. We first introduce 
the background of the subject and major develop- 
ments. We then highlight the key challenges and 
illustrate the underlying principles with an example. 

BACKGROUND 

In this section, we briefly review the traditional 
methods for studying scientific revolutions, and the 
introduction of quantitative approaches proposed to 
overcome cumbersome techniques for visualizing 
these revolutions. 



The concept of scientific revolutions was de- 
fined by Thomas Kuhn in his Structure of Scientific 
Revolutions (Kuhn, 1962). According to Kuhn, 
science can be characterized by normal science, 
crisis, and revolutionary phases. A scientific revolu- 
tion is often characterized by the so-called para- 
digm shift. 

Many sociologists and philosophers of science 
have studied revolutions under this framework, in- 
cluding the continental drift and plate tectonics 
revolution in geology (Stewart, 1990) and a number 
of revolutions studied by Kuhn himself. Scientists in 
many individual disciplines are very interested in 
understanding revolutions that took place at their 
doorsteps, for example, the first-hand accounts of 
periodical mass extinctions (Raup, 1999), and 
superstring revolutions in physics (Schwarz, 1996). 

Traditional methods of studying scientific revolu- 
tions, especially sociological and philosophical stud- 
ies, are time consuming and laborious; they tend to 
overly rely on investigators’ intimate understanding 
of a scientific discipline to interpret the findings and 
evidence. The lack of large-scale, comparable, timely,
and highly repeatable procedures and tools has
severely hindered the widespread adoption and
dissemination of such research. Scientists, sociologists,
historians, and philosophers need to have readily
accessible tools to facilitate increasingly complex 
and time-consuming tasks of analyzing and monitor- 
ing the latest development in their fields. 

Quantitative approaches have been proposed for 
decades, notably in scientometrics, to study science 
itself by using scientific methods, hence the name 
science of science (Price, 1965). Many expect that 
quantitative approaches to the study of science may 
enable analysts to study the dynamics of science. 
Information science and computer science have
become the major driving forces behind the movement.
Commonly used sources of input of such
studies include a wide variety of scientific publica- 
tions in books, periodicals, and conference proceed- 
ings. Subject-specific repositories include the ACM 
Digital Library for computer science, PubMed Cen- 
tral for life sciences, the increasing number of open- 
access preprint archives such as www.arxiv.org, 
and the World Wide Web. 



THE DYNAMICS OF RESEARCH FRONTS

Derek Price (1965) found that the more recent 
papers tend to be cited about six times more often 
than earlier papers. He suggested that scientific 
literature contains two distinct parts: a classic part 
and a transient part, and that the two parts have 
different citation half-lives. Citation half-lives mimic 
the concept of half-life of atoms, which is the amount 
of time it takes for half of the atoms in a sample to 
decay. Simply speaking, classic papers tend to be 
longer lasting than transient ones in terms of how 
long their values hold. The extent to which a field is 
largely classic or largely transient varies widely 
from field to field; mathematics, for example, is 
strongly predominated by the classic part, whereas 
life sciences tend to be highly transient. 
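
The half-life analogy can be made concrete with a short calculation. The sketch below assumes, by analogy with radioactive decay, that the share of citation activity a paper retains falls off exponentially with age. The two half-life values are hypothetical numbers chosen for illustration, not figures reported by Price.

```python
def remaining_fraction(years: float, half_life: float) -> float:
    """Fraction of the initial citation activity left after `years`,
    assuming exponential decay with the given half-life."""
    return 0.5 ** (years / half_life)


CLASSIC_HALF_LIFE = 10.0    # hypothetical, in years
TRANSIENT_HALF_LIFE = 2.5   # hypothetical, in years

classic_after_5 = remaining_fraction(5.0, CLASSIC_HALF_LIFE)
transient_after_5 = remaining_fraction(5.0, TRANSIENT_HALF_LIFE)
```

Under these assumed values, a classic paper retains roughly 71% of its citation activity after five years, while a transient paper retains 25%.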

The notion of research fronts is also introduced 
by Price as the collection of highly-cited papers that 
represent the frontiers of science at a particular 
point of time. He examined citation patterns of 
scientific papers and identified the significance of 
the role of a quantitative method for delineating the 
topography of current scientific literature in under- 
standing the nature of such moving frontiers. 

It was Eugene Garfield, the founder of the Insti- 
tute for Scientific Information (ISI) and the father of 
Science Citation Index (SCI), who introduced the 
idea of using cited references as an indexing mecha- 
nism to improve the understanding of scientific 
literature. The citation index has provided researchers
with new ways to grasp the development of science and
to cast a glimpse of the big picture of science. A
citation is an instance in which a published article $a$ makes
a reference to a published item $b$ in the literature, be it
a journal paper, a conference paper, a book, a
technical report, or a dissertation. A citation is
directional, $a \rightarrow b$. A co-citation is a higher-order
instance involving three articles, $a$, $b_i$, and $b_j$, if we
find both $a \rightarrow b_i$ and $a \rightarrow b_j$. Articles $b_i$ and $b_j$ are
co-cited. A citation network is a directed graph,
whereas a co-citation network is an undirected
graph.
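
The step from a directed citation network to an undirected co-citation network can be sketched as follows. The function name and the toy data are assumptions for illustration. Each citing article contributes one co-citation to every unordered pair of items in its reference list.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List


def cocitation_counts(citations: Dict[str, List[str]]) -> Counter:
    """Map each unordered pair of cited items to the number of articles citing both.

    `citations` maps a citing article to the items it cites (directed links).
    Pairs are stored as sorted tuples, so the result is undirected.
    """
    counts: Counter = Counter()
    for references in citations.values():
        for pair in combinations(sorted(set(references)), 2):
            counts[pair] += 1
    return counts


# Toy citation network: a1 cites b1, b2, b3; a2 cites b2, b3; a3 cites b1.
citations = {
    "a1": ["b1", "b2", "b3"],
    "a2": ["b2", "b3"],
    "a3": ["b1"],
}
links = cocitation_counts(citations)
```

In this toy network, b2 and b3 are co-cited twice (by a1 and a2), while b1 is co-cited once with each of them. The counts can serve as link weights when clustering.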

Researchers have utilized co-citation relation- 
ships as a clustering mechanism. As a result, a 
cluster of articles grouped together by their co- 
citation connections can be used to represent more 
elusive phenomena such as specialties, research
themes, and research fronts. Much of today’s re- 
search in co-citation analysis is inspired by Small and 
Griffith’s (1974) work in the 1970s, in which they 
identified specialties based on co-citation networks. 
A detailed description of the subject can be found in 
Chen (2003). A noteworthy service is the ISI Essen- 
tial Science Indicators (ESI) Special Topics, launched 
in 2001. It provides citation analyses and commentaries
of selected scientific areas that have experienced
recent advances or are of special current
interest. A new topic is added monthly. Other impor- 
tant methods include co-word analysis (Callon, 
Law, & Rip, 1986). A fine example of combining co- 
citation and co-word analysis is given by Braam, 
Moed, and van Raan (1991). 

KNOWLEDGE DIFFUSION 

Knowledge diffusion is the adaptation of knowl- 
edge in a broad range of scientific and engineering 
research and development. Tracing knowledge dif- 
fusion between science and technology is a chal- 
lenging issue due to the complexity of identifying 
emerging patterns in a diverse range of possible 
processes (Chen & Hicks, 2004; Oppenheimer, 
2000).

Just as citation indexing is important to modeling and visualizing
scientific frontiers, understanding patent citations
is important to the study of knowledge diffusion
(Jaffe & Trajtenberg, 2002). There are a number of
extensively studied cases of knowledge diffusion, or knowledge
spillover, namely liquid crystal display
(LCD) and nanotechnology. Knowledge diffusion
between basic research and technological innova- 
tion is also intrinsically related to the scientific 
revolution. Carpenter, Cooper, and Narin (1980) 
found that nearly 90% of journal references made by
patent applicants and examiners refer to basic or
applied scientific journals, as opposed to engineering 
and technological literature. Research in universities, 
government laboratories, and various non-profit research
institutions has been playing a pivotal role in
technological inventions and innovations (Narin, 
Hamilton, & Olivastro, 1997). On the other hand, 
Meyer (2001) studied patent-to-paper citations be- 
tween nano-science and nanotechnology and con- 
cluded that they are as different as two different 
disciplines. 

Earlier research highlighted a tendency of geographical
localization in knowledge spillovers, that is,
knowledge diffusion (Jaffe & Trajtenberg,
2002). Agrawal, Cockburn, and McHale (2003) found
that the influence of social ties between collaborative
inventors may be even stronger when it comes to
accounting for knowledge diffusion: Inventors’ patents
are continuously cited by their colleagues in their 
former institutions. 

The Role of Social Network Analysis in 
Knowledge Diffusion 

Social network analysis is playing an increasingly 
important role in understanding knowledge diffusion 
pathways. Classic social network studies such as the 
work of Granovetter (1973) on weak ties and struc- 
tural holes (Burt, 1992) provide the initial inspira- 
tions. Singh (2004) studied the role of social ties one 
step further by taking into account not only direct 
social ties but also indirect ones in social networks of 
inventors’ teams based on data extracted from U.S. 
Patent Office patents. Two teams with a common 
inventor are connected in the social network. Knowl- 
edge flows between teams are analyzed in terms of 
patent citations. Socially proximate teams have a 
better chance to see knowledge flows between them. 
The chance shrinks as social distance increases. 
More importantly, the study reveals that social links 
offer a good explanation of why knowledge spillovers
appear to be geographically localized. It is the social 
link that really matters; geographic proximity hap- 
pens to foster social links. 
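
The team network and the notion of social distance can be sketched as follows. This is a minimal illustration of the construction described above (two teams are linked when they share an inventor), not Singh's actual procedure. The team and inventor names are hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Set


def team_graph(teams: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Connect two teams whenever they have at least one inventor in common."""
    graph: Dict[str, Set[str]] = {name: set() for name in teams}
    names = list(teams)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if teams[first] & teams[second]:
                graph[first].add(second)
                graph[second].add(first)
    return graph


def social_distance(graph: Dict[str, Set[str]],
                    source: str, target: str) -> Optional[int]:
    """Number of links on a shortest path, or None if the teams are unconnected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        team, dist = queue.popleft()
        for other in graph[team]:
            if other == target:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None


teams = {
    "T1": {"ann", "bob"},
    "T2": {"bob", "cy"},
    "T3": {"cy", "dee"},
    "T4": {"eve"},
}
graph = team_graph(teams)
```

T1 and T2 are directly tied through a shared inventor (distance 1), T1 and T3 are tied only indirectly through T2 (distance 2), and T4 has no social path to the others. An analysis along these lines would then compare patent citation rates between team pairs at each distance.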

The Key Player Problem 

The key player problem in social network analysis 
is also relevant. The problem is whether a maximum
or a minimum spread is desired in a social network
(Borgatti & Everett, 1992). If we want to spread, or 
diffuse, something as quickly or thoroughly as pos- 
sible through a network, where do we begin? In 
contrast, if the goal is to minimize the spread, which 
nodes in the network should we isolate? Borgatti 
found that because off-the-shelf centrality mea- 
sures make assumptions about the way things flow 
in a network, when they are applied to the “wrong” 
flows, they get the “wrong” answers. Furthermore, 
few existing measures are appropriate for the most 
interesting types of network flows, such as the flows
of gossip, information, and infections. From the 
modelling perspective, an interaction between flow 
and centrality can identify who gets things early and 
who gets a lot of traffic (betweenness). The example
explained later in this article utilizes the
betweenness centrality metric to identify pivotal
points between thematic clusters. 
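
Betweenness centrality counts, for each node, the shortest paths between other nodes that pass through it. The sketch below uses Brandes' accumulation scheme for an unweighted, undirected graph and returns raw (unnormalised) scores. It is an illustration of the metric only and is not taken from any particular tool. The toy graph is hypothetical.

```python
from collections import deque
from typing import Dict, Set


def betweenness(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Unnormalised betweenness centrality for an unweighted, undirected graph."""
    scores = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search, counting shortest paths (sigma) from the source.
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2.0 for v, s in scores.items()}


# Two triangles (a, b, c) and (e, f, g) joined only through node d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d", "f", "g"}, "f": {"e", "g"}, "g": {"e", "f"},
}
scores = betweenness(graph)
```

Node d, the only bridge between the two clusters, lies on all nine shortest paths that cross from one triangle to the other and therefore receives the highest score. This is the sense in which a node with high betweenness marks a pivotal point between clusters.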

Analysis Using Network Visualization 

Freeman (2000) identifies the fundamental role of 
network visualization in helping researchers to un- 
derstand various properties of a social network and 
to communicate such insights to others. He pointed 
to interesting trends such as increasingly higher- 
dimensional visualizations, changing from factor 
analysis to scaling techniques such as principal
component analysis and correspondence analysis,
more widely used layout algorithms such as spring
embedder, and more and more interactive images with
color and animation. He envisaged that the greatest
need for further social network analysis is integrative
tools that enable us to access network datasets,
compute, and visualize their structural properties
quickly and easily, all within a single program. An
increasing number of social network analysis software
packages are becoming available, including Pajek, InFlow,
and many others. 

The information visualization community has 
also produced a stream of computer systems that 
could be potentially applicable to track knowledge 
diffusion. Examples of visualizing evolving infor- 
mation structure include disk trees and time tubes 
(Chi, Pitkow, Mackinlay, Pirolli, Gossweiler, & 
Card, 1998), which display the changes of a Web 
site over time. Chen and Carr (1999) visualized the 
evolution of the field of hypertext using author co-citation
networks over a period of nine years. Animated
visualization techniques have been used to
track competing paradigms in scientific disciplines 
over a period of 64 years (Chen, Cribbin, Macredie, 
& Morar, 2002). 

APPLICATIONS 

The CiteSpace application is discussed here. 

CiteSpace is an integrated environment designed 
to facilitate the modeling and visualization of struc- 
tural and temporal patterns in scientific literature 
(Chen, 2004b). CiteSpace supports an increasing 
number of input data types, including bibliographic 
data extracted from the scientific literature, grant 
awards data, digital libraries, and real-time data 
streaming on the Web. The conceptual model of 
CiteSpace is the panoramic expansion of a series of
snapshots over time. The goal of CiteSpace is to 
represent the most salient structural and temporal 
properties of a subject domain across a user-specified
time interval $T$. CiteSpace allows the user to
slice the time interval $T$ into a number of consecutive
sub-intervals $T_i$, called time slices. A hybrid network
$N_i$ is derived in each $T_i$. The time series of networks
$N_i$ provides the vehicle for subsequent analysis and
visualization. Network analysis techniques such as 
network scaling can be applied to the time series. 
CiteSpace helps the user to focus on pivotal points 
and critical pathways as the underlying phenomenon 
goes through profound changes. CiteSpace is still 
evolving as it embraces richer collections of data 
types and supports a wider range of data analysis, 
knowledge discovery, and decision support tasks. 
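
The time-slicing idea can be sketched as follows. This is a simplified illustration and not CiteSpace's own code: it builds only article co-citation links (the hybrid networks described above also contain terms), and all function names and records are hypothetical. Each link in the merged network keeps the start year of the first slice in which it appeared.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

Slice = Tuple[int, int]          # inclusive (first_year, last_year)
Record = Tuple[int, List[str]]   # (year of the citing article, cited references)
Pair = Tuple[str, str]


def time_slices(start: int, end: int, length: int) -> List[Slice]:
    """Cut the interval [start, end] into consecutive slices of `length` years."""
    return [(y, min(y + length - 1, end)) for y in range(start, end + 1, length)]


def slice_networks(records: List[Record],
                   slices: List[Slice]) -> Dict[Slice, Counter]:
    """Build one co-citation network per time slice."""
    networks: Dict[Slice, Counter] = {s: Counter() for s in slices}
    for year, references in records:
        for first, last in slices:
            if first <= year <= last:
                for pair in combinations(sorted(set(references)), 2):
                    networks[(first, last)][pair] += 1
                break
    return networks


def merge_slices(networks: Dict[Slice, Counter]) -> Dict[Pair, int]:
    """Merge the series, keeping for each link the first slice it appeared in."""
    first_seen: Dict[Pair, int] = {}
    for (first, _last), network in sorted(networks.items()):
        for pair in network:
            first_seen.setdefault(pair, first)
    return first_seen


records = [
    (1996, ["x", "y"]),
    (1999, ["x", "y", "z"]),
    (2002, ["y", "w"]),
]
slices = time_slices(1996, 2003, 4)
networks = slice_networks(records, slices)
merged = merge_slices(networks)
```

With these toy records, the interval 1996-2003 is cut into two four-year slices. The link between x and y is found in the first slice, while the link between w and y first appears in the second.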

An Example of Trend Analysis in 
CiteSpace 

The example is motivated by the question: What are 
the leading research topics in the scientific literature 
of terrorism research? In this case, we expect 
CiteSpace will reveal themes related to the aftermath
of the terrorist attacks on September 11, 2001,
and some of the earlier ones. 

The initial dataset was drawn from the Web of 
Science using the query “terrorism.” The dataset 
was processed by CiteSpace. Figure 1 is a screenshot 
of a resultant visualization, which is a chain of sub- 
networks merged across individual time slices. The 
merged network consists of two types of nodes: new
terms found in citing articles (labeled in red) and
articles being cited in the dataset (labeled in blue).
The network also contains three types of links: term-term
occurrence, article-article co-citation, and term-to-article
citation. The color of a link corresponds to
the year in which the first instance was found. There
are three predominating clusters in Figure 1, labeled
as follows: (A) post-September 11th, (B) pre-September
11, and (C) Oklahoma bombing. Automatically extracted
high-fly terms identified the emerging trends
associated with each cluster. Cluster A is associated
with post-traumatic stress disorders (PTSD); Cluster
B is associated with health care and bioterrorism;
and Cluster C is associated with terrorist bombings
and body injuries.

Figure 1. Emerging trends and clusters in the terrorism research literature (Source: Chen, 2004a)

Figure 2. Additional functions supported by CiteSpace

Figure 2 shows the merged network in
CiteSpace. Purple-circled nodes are high in
betweenness centrality. The red rectangle indicates
the current marquee selection area. Articles in the
selected group are displayed in the tables along with
matching medical subject heading (MeSH) indexing
terms retrieved from PubMed. The nature of the
selected group is characterized by the top-ranked
MeSH major term, biological warfare (assigned
to 13 articles in the group). The most frequently cited
article in this group, by Franz et al. (2001), is in the
top row of the table. Its centrality measure of 0.21 is
shown in the second column.
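
For readers unfamiliar with betweenness centrality, the following is a minimal sketch of how such a score can be computed for a small undirected, unweighted network. It uses Brandes' algorithm, which is an assumption made for illustration; the text does not state how CiteSpace computes the measure. The function name and the adjacency-dictionary input format are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness for an undirected, unweighted graph.

    `adjacency` maps each node to a list of neighbours, and every edge
    must be listed in both directions. Sketch of Brandes' algorithm.
    """
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from source.
        order = []
        predecessors = {v: [] for v in nodes}
        path_count = {v: 0 for v in nodes}
        distance = {v: -1 for v in nodes}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies in reverse breadth-first order.
        dependency = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in predecessors[w]:
                share = path_count[v] / path_count[w]
                dependency[v] += share * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair was counted twice, so divide by (n-1)(n-2).
    n = len(nodes)
    scale = (n - 1) * (n - 2)
    if scale > 0:
        for v in score:
            score[v] /= scale
    return score


# Hypothetical three-node chain: every shortest path between the two
# end nodes passes through the middle node.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
chain_scores = betweenness_centrality(chain)
```

A node with a high score lies on many shortest paths between other nodes, which is why such nodes are candidates for the pivotal points mentioned earlier.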



The red rectangle in Figure 2 marks an area 
selected by the user. The operation is called marquee
selection. Each node that falls into the area is
selected. In this case, CiteSpace launches an instant
search in PubMed, the largest medical literature
resource on the Web. If there is a match, the MeSH
terms assigned by human indexers to the matched
article will be retrieved. Such terms serve as a gold
standard for identifying the topic of the article.
Frequently used terms across all articles in the 
marquee selection are ranked and listed in the tables 
located in the lower right area in the screen display. 
For example, the most frequently used MeSH term 
for this cluster is biological warfare, which was 
assigned to 13 articles in the cluster. 
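
The ranking step described above amounts to counting, for each indexing term, how many selected articles carry it. A minimal sketch follows, assuming the MeSH terms have already been retrieved; the function name, the article identifiers, and the term lists are hypothetical, and no PubMed call is made here.

```python
from collections import Counter


def rank_terms(selected_articles):
    """Rank indexing terms by the number of selected articles using them.

    `selected_articles` maps an article identifier to its list of terms.
    Ties are broken alphabetically so that the result is deterministic.
    """
    counts = Counter()
    for terms in selected_articles.values():
        # Count each term at most once per article.
        counts.update(set(terms))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical marquee selection of three articles.
selection = {
    "article-1": ["Biological Warfare", "Bioterrorism"],
    "article-2": ["Biological Warfare", "Anthrax"],
    "article-3": ["Anthrax", "Biological Warfare"],
}
ranking = rank_terms(selection)
```

The first entry of `ranking` is the term shared by the most articles in the selection, which plays the role of the top row in the tables described above.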

FUTURE TRENDS AND 
CONCLUSION 

Major challenges include the need to detect emer- 
gent trends and abrupt changes in complex and 
transient systems accurately and efficiently, and the 
need to represent such changes and patterns effec- 
tively so that one can understand intuitively the 
underlying dynamics of scientific frontiers and the 
movement of research fronts. Detecting trends and 
abrupt changes computationally poses a challenge,
especially at macroscopic levels if the subject is a 
discipline, a profession, or a field of study as a whole. 
Although many algorithms and systems have been 
developed to address phenomena at smaller scales, 
efforts aiming at macroscopic trends and profound 
changes have been rare. The emerging field of knowledge
domain visualization (KDviz) studies the evolution of
a scientific knowledge domain (Chen, 2003). The
need to assemble data from a diverse range of 
sources also poses a challenge. IBM’s 
DiscoveryLink is a promising example of a middle- 
ware approach to address this challenge in the 
context of life sciences. 

Effective means of visually representing the structural
and dynamical properties of complex and transient
systems are increasingly attracting researchers’
attention, but this work is still at an early stage. Scalability,
visual metaphor, interactivity, perceptual and cognitive
task decomposition, interpretability, and how to
represent critical changes over time are among the 
most pressing issues to be addressed. 

There is a widespread need for tools that can
help a broad range of users, including scientists, 
science policy decision makers, sociologists, and 
philosophers, to discover new insights from the 
increasingly complex information. As advocated in 
Norman (1998), designing task-centered and hu- 
man-centered tools is a vital step to lead us to a new 
paradigm of interacting with complex, time-variant 
information environments. Human-computer inter- 
action holds the key to some of the most critical 
paths towards integrating the new ways of analysis, 
discovery, and decision-making. 

KEY TERMS

Citation Indexing: The indexing mechanism 
invented by Eugene Garfield, in which cited work, 
rather than subject terms, is used as part of the
indexing vocabulary. 

Knowledge Diffusion: The adaptation of knowledge
in a broad range of scientific and engineering
research and development. 

Knowledge Domain Visualization (KDviz): An
emerging field that focuses on using data analysis,
modeling, and visualization techniques to facilitate
the study of a knowledge domain, which includes
research fronts, intellectual basis, and other
aspects of a knowledge domain. KDviz emphasizes 
a holistic approach to treat a knowledge domain as 
a cohesive whole in historical, logical, and social 
contexts. 

Paradigm Shift: The mechanism of scientific 
revolutions proposed by Kuhn. The cause of a 
scientific revolution is rooted in the change of a
paradigm, or a view of the world. 

Research Fronts: A transient collection of 
highly-cited scientific publications by the latest pub- 
lications. Clusters of highly co-cited articles are 
regarded as a representation of a research front. 

Scientific Revolutions: Rapid and fundamen- 
tal changes in science as defined by Thomas Kuhn 
in his Structure of Scientific Revolutions (Kuhn, 
1962). 

INTRODUCTION

HCI has grown up with the desktop; as the special- 
ized tools used for serious scientific endeavor gave 
way first of all to common workplace and then to 
domestic use, so the market for the interface has 
changed, and the experience of the user has become 
of more interest. It has been said that the interface, 
to the user, is the computer — it constitutes the 
experience — and as the interface has become richer 
with increasing processing power to run it, this 
experiential aspect has taken center stage 
(Crampton-Smith & Tabor, 1992). Interaction de- 
sign has focused largely on the interface as screen 
with point-and-click control and with layered inter- 
active environments. More recently, it has become 
concerned with other modes of interaction; notably, 
voice-activated controls and aural feedback, and as 
it emerges from research laboratories, haptic inter- 
action. Research on physicalizing computing in new 
ways, on the melding of bits and atoms, has produced 
exciting concepts for distributed computing but si- 
multaneously has raised important questions regard- 
ing our experience of them. Work in tangible and 
ubiquitous computing is leading to the possibility of 
fuller sensory engagement both with and through 
computers, and as the predominance of visual inter- 
action gives way to a more plenary bodily experi- 
ence, pragmatism alone no longer seems a sufficient 
operative philosophy in much the same way that 
visual perception does not account solely for bodily 
experience. 

Interaction design and HCI in their 
interdisciplinarity have embraced many different 
design approaches. The question of what design is 
has become as important as the products being 
produced, and computing has not been backward in 
learning from other design disciplines such as archi- 
tecture, product design, graphics, and urban planning 
(Winograd, 1992). However, despite thinkers writ- 
ing that interaction design is “more like art than 
science” (Crampton-Smith & Tabor, 1992, p. 37), it
is still design with a specific, useful end. It is obvious,
for example, how user-centered design in its many 
methods is aimed at producing better information 
systems. In knowing more about the context of use, 
the tasks the tool will be put to, and the traits of the 
users, it hopes to better predict patterns and trajec- 
tories of use. The holy grail in the design of tools is 
that the tool disappears in use. Transparency is all; 
Donald Norman (1999) writes that “technology is 
our friend when it is inconspicuous, working smoothly 
and invisibly in the background ... to provide comfort 
and benefit” (p. 115). 

It is tempting to point to the recent trend for 
emotional design as a step in the right direction in 
rethinking technology’s roles. But emotional design 
does not reassess design itself; in both its aims and 
methods, emotional design remains closely tied to 
the pragmatic goals of design as a whole. Both are 
concerned with precognition — good tools should be 
instantly recognizable, be introduced through an 
existing conceptual framework, and exhibit effective
affordances that point to their functionality; while
emotional design seeks to speak to the subconscious 
to make us feel without knowing (Colin, 2001). 
These types of design activity thus continue to 
operate within the larger pragmatic system, which 
casts technology as a tool without questioning the 
larger system itself. More interesting is the emerging 
trajectory of HCI, which attempts to take account of 
both the precognitive and interpretive to “construct 
a broader, more encompassing concept of ‘usabil- 
ity’” (Carroll, 2004, pp. 38-40). 

This article presents art as a critical methodology 
well placed to question technology in society, further 
broadening and challenging the HCI of usability. 

BACKGROUND 

Artists work to develop personal visual languages. 
They strive toward unified systems of connotative 
signifiers to create an artistic whole. They draw and 
redraw, make and remake, engaging directly with
sources of visual and sensory research and with 
materials, immanently defining their own affective 
responses, and through a body of work present their 
world for open reading (Eco, 1989; Eldridge, 2003; 
Greenhalgh, 2002). Artists of all kinds commonly 
keep notebooks or sketchbooks of ideas for develop- 
ment along with explorations for possible expression 
of those ideas. They habitually collect and analyze 
source material and work through strands of thought 
using sketches and models, simultaneously defining 
the aspect of experience they are interested in 
representing and finding ways of manifesting that 
representation. 

What is Represented? 

Debate about what is represented in art tends to 
highlight issues surrounding Cartesian duality. Com- 
monly, processes of depiction and description might 
seem, through their use of semiotic systems, to be 
centered around the object out there; the desktop 
metaphor in HCI is a good example. This apparent 
combination of objectivity with the manifold subjec- 
tivity involved in reading art poses philosophical 
problems, not least of which is the nature of that 
which is represented in the work. 

Merleau-Ponty defines the phenomenological 
world as “not the bringing to explicit expression of a 
pre-existing being, but the laying down of being,” and 
that art is not the “reflection of a pre-existing truth” 
but rather “the act of bringing truth into being” 
(Merleau-Ponty, 2002, pp. xxii-xxiii). Thus, when 
we talk about a representation, it should be clear that 
it is not symbolic only of an existing real phenom- 
enon, whether object or emotion, but exists instead 
as a new gestalt in its own right. 

Bearing this in mind, we may yet say that the 
artist simultaneously expresses an emotion and makes 
comment upon it through the means of the material- 
ity of the work. Both these elements are necessary 
for art to exist — indeed, the very word indicates a 
manipulation. Without either the emotional source 
(i.e., the artist’s reaction to the subject matter) or the
attendant comment (i.e., the nature of its material- 
ity), there would appear to be “no art, only empty 
decorativeness” (Eldridge, 2003, pp. 25-26). This is 
where design can be differentiated as pragmatic in 
relation to art: although it may be a practice “situated
within communities, ... an exploration ... already in
progress prior to any design situation” (Coyne, 1995,
p. 11), design lacks the aboutness of art, which is why
the position for HCI as laid out here is critical as 
opposed to pragmatic. 

What is Read? 

Meaning making is an agentive process not only for 
the artist but also for the audience; a viewer in 
passive reception of spectacle does not build mean- 
ing or understanding in relation to his or her own 
lifeworld; the viewer is merely entertained. The 
created artwork is experienced in the first instance 
as a gestalt; in a successful work, cognitive trains of 
thought are triggered, opening up “authentic routes 
of feeling” in the viewer (Eldridge, 2003, p. 71). The
difficulties in talking about art have been explicated 
by Susanne Langer as hinging on its concurrent 
status as expression for its maker and as impression 
for its audience (Langer, 1953). However, this is the 
nature of any language, which is manipulated not just 
to communicate explicit information but as a social 
activity geared toward consensual understanding 
(Winograd & Flores, 1986). The “working through 
undertaken by the artist” is “subsequently followed 
and recapitulated by the audience” (Eldridge, 2003, 
p. 70). Just as phenomenology sees language as a
socially grounded activity (e.g., in speech act
theory) (Winograd & Flores, 1986), so art as a
language is also primarily a process of activity 
among people. The artwork is a locus for discourse, 
engaged with ordinary life and, indeed, truth (Farrell 
Krell, 1977; Hilton, 2003; Ziarek, 2002), as is phe- 
nomenology, expressing and inviting participation in 
the social activity of meaning making (Eldridge, 
2003; Greenhalgh, 2002; McCarthy & Wright, 2003). 
The temptation to see artists’ disengagement from 
society as irrelevant to more user-centered prac- 
tices is, therefore, misconceived. Empowerment of 
the user or audience occurs within both processes, 
only at different points, and with ramifications for 
the nature of the resulting artifacts. It is argued here 
that involving the user directly in the design process 
correspondingly lessens the need for the user to 
actively engage with the final artifact and, con- 
versely, that removing the user from the process in 
turn demands the user’s full emotional and cognitive
apprehension in interaction with the product. The 
user/audience thus is invited into discourse with the
artist and other viewers rather than with themselves, 
as the work provides a new ground or environment 
for social being. 

Simultaneity: How Things Are Read 

The way in which artists, their audiences, and users 
apprehend artifacts is of central importance to the 
way in which HCI may attempt to understand inter- 
action. This section briefly introduces a new direc- 
tion in phenomenologically informed methodologies 
as a promising way forward. 

Phenomenology says we cannot step outside the 
world, yet this is supposedly what artists do as a 
matter of course. People cannot remove themselves 
from the worlds they find themselves in, because 
those worlds and individuals create each other simul- 
taneously through action (Merleau-Ponty, 2002). But 
this occurs at a macro level within which, of course, 
humans continue to engage in cognitive processes. 
The artist indeed does attempt to examine his or her 
own sensual and emotional reactions to phenomena 
in the world through agentive attention to them and 
uses intellectual means to produce sensual work. In 
this respect, the artist popularly might be said to stand 
back from his or her own immediate reactions, but the 
artist does not and cannot be said to step outside of 
his or her world as a phenomenological unity (Winograd 
& Flores, 1986), in which case, the artist more 
precisely could be said to be particularly adept at 
disengagement within his or her world and practiced 
at shifting his or her domain of concern within an 
experience: 

The emotion is, as Wordsworth puts it, “recollected
in tranquillity”. There is a sense of working
through the subject matter and how it is
appropriate to feel about it ... Feeling here is
mediated by thought and by artistic activity. The
poet must, Wordsworth observes, think “long and
deeply” about the subject, “for our continued
influxes of feeling are modified and directed by
our thoughts.” (Eldridge, 2003, p. 70)

This process has come to be understood as one of 
deeper engagement within the feeling process, within 
the forming of consciousness itself. Here is Augusto
Boal, Brazilian theater director, on the technicalities
of acting, echoing contemporary theories of consciousness:

The rationalisation of emotion does not take 
place solely after the emotion has disappeared, 
it is immanent in the emotion, it also takes place 
in the course of an emotion. There is a 
simultaneity of feeling and thinking. (Boal, 1992, 
p. 47) 

This finds corroboration in Dennett's theory of 
consciousness, which likens the brain to a massive 
parallel processor from which narrative streams 
and sequences emerge, subjected to continual edit- 
ing (Dennett, 1991). In light of this, we also might 
describe the artist not so much as distanced from his 
or her observed world but as practiced at probing 
his or her own changing consciousness of phenom- 
ena and reflecting on the value of the resulting 
narratives. 

We have seen, then, that art is a process not only 
of representation but also of philosophical question- 
ing and narrative, and that as a provider of grounds 
for discourse and meaning making, it plays a crucial 
role in the healthy progression of society. The 
question, of course, is how this might be of practical 
use to HCI.



FUTURE TRENDS 

It goes without saying that we are not all profes- 
sional artists, and that of those who are, not all are 
able to create successful works like those we have 
just attempted to describe. If the description of the 
art process and its products is a struggle, putting the 
theory into action can only be more so. As it already 
does in the domain of tools, HCI, nevertheless,
should seek to do two things: to understand the 
creation process of this type of computational arti- 
fact and to understand the perceptions of the people 
who interact with them. As a basis of any method- 
ology in understanding the arts, Susanne Langer 
(1953) pointed to the qualitative and even phenom- 
enological, emphasizing the need for us to “know 
the arts, so to speak, from the inside ... it is in fact 
impossible to talk about art without adapting to . . . 
the language of the artist” (p. ix). The trend for 
phenomenologically informed methodologies in HCI 
is also growing. Recent developments, especially in
the growing relations between phenomenology and 
cognitive science, are now setting a precedent for a 
“first person treatment of HCI ... a phenomenologi- 
cally-informed account of the study of people using 
... technologies” (Turner, 2004; Ihde, 2002). 

Understanding art may not be as daunting as it 
first appears. HCI already knows how to conduct 
phenomenologically informed inquiries, is used to 
methods revolving around the analysis of conversa- 
tion and language, and has a history of cross- 
disciplinary teams. Product designers frequently are 
involved in teams looking at tools; artists and art 
theorists now should be able to provide valuable 
insight and be placed in teams engaged in the devel- 
opment of non-tools. This section now looks at the 
work of Anthony Dunne followed by that of the 
author as practical examples of putting this broad 
approach into practice. 

Anthony Dunne 

Through conceptual genotypes, Anthony Dunne 
(1999) has sought to develop an aesthetics of use 
based upon a product’s behavior rather than just its 
form, and to extend preconceptions of our “subjec- 
tive relationship with the world, our active relation- 
ship with other people” (p. 5). Thief of Affections 
was conceptualized as being based on an alternative 
user persona, an “otaku,” or obsessive individual. 
Seeking intimacy, this user would be able to “techno- 
logically grope the victim’s heart” (Dunne, 1999, p. 
97). Dunne’s (1999) creative process followed two 
lines — an investigation of the how, and the what 
like. In keeping with his criteria for value fictions as 
opposed to science fictions (Dunne, 1999), the tech- 
nology had to be believable although not imple- 
mented. The final concept has the thief stealing 
weak radio signals from unsuspecting victims’ pace- 
makers, using the narrative prop developed with a 
sensitivity to the connotative aspects of its materials. 
The design concept is presented in a series of 
photographs rather than as a conventional prototype, 
emphasising its “psycho-social narrative possibili- 
ties” (Dunne, 1999, p. 100). In this and in a later 
work with Fiona Raby can be seen a concern with 
bracketing or framing concept designs in such a way 
as to encourage the audience both to contextualize 



them within their own lives and to question those 
technologized lives. Dunne’s (1999) problems with 
audience perceptions of his work when shown in 
galleries led to later works like the Placebo project 
to be placed with participants within their own 
homes over a period of time. The self-selecting 
participants, or “adopters” of the products, filled in 
application forms detailing experiences and attitudes 
to electronic products, and at the end of their allotted 
time, they were interviewed (although it is not made 
clear if the gallery audience also was interviewed). 
Ideas and details from these interviews then were 
translated into a series of photographs expressing 
the central findings of the project (Dunne & Raby, 
2002, 2002a).

Interactive Jewelery 

This project differs in its process from Dunne’s 
(1999) in its emphasis on source materials and works 
through sketches and models (Figure 1) toward a 
finished working product: in this case, interactive 
jewelery using ProSpeckz prototype technology (see 
SpeckNet.org.uk). The aims of the project are to 
investigate how contemporary craft as an art disci- 
pline may benefit HCI reciprocally. The demonstra- 
tion pendant (Figure 2) is fabricated from acrylic, 
formica, mild steel, and gold leaf, and interacts with 
other pieces in the same series to map social inter- 
action, specifically modes of greeting, through dy- 
namic LED displays. The design is user-centered 
only in the sense that fashion is user-centered; that 
is, there is a target audience (a friendship group of
six women), and common elements of lifestyle and
style preference are observed, including a notable
consumption of art. The designs thus start with
source material of interest to the group (e.g., gardens,
landscapes) and bear in mind colors that this
age group more commonly wears (e.g., pale blue,
mauves, greens, greys), but beyond this, there is no
iterative design process as HCI would currently
understand it. Instead, the pieces of jewelery are
presented as finished and tested through observation
of the out-of-the-box experience and of longitudinal
usage as a group. In the manner of an art object, they
are more or less provocative and seductive, and are
succeeded quickly by further work.

Figure 1. Source and sketches

Figure 2. Interactive pendant

Other research presents critical design examples 
as the main deliverable, with the central aim of the 
research as the provocation of dialogue within the 
HCI community. These often promote computational
elements as material in their own right (Hallnas
et al., 2001; Hallnas & Redstrom, 2002; Hallnas et
al., 2002a; Heiner et al., 1999; Orth, 2001; Post et al.,
2000) or reflect on the application of a methodology
based on an inversion of the form-follows-function 
leitmotif (Hallnas et al., 2002b). 

While this shift in attention to the experiential is 
relatively new, it is apparent that conflicts arise in 
the balancing of roles of designer and user. Most 
obvious is that the methods of a user-centered 
approach simply cannot be transposed to meet the 
needs of the artist-centered methodology. The re- 
examination of the end goals in design requires no 
less a thorough reworking of the means to their 
realization. This is illustrated amply by Vitaly Komar
and Alex Melamid’s scientific approach to art.
Asking people questions like “What’s your favorite
color?” and “Do you prefer landscapes to portraits?”,
they produced profoundly disturbing exhibitions
of perfectly user-centered art (Wypijewski,
1999; Norman, 2004).

CONCLUSION 

Methodologies in HCI have always been borrowed 
from other disciplines and combined with still others 
in new ways. The challenge now, caused by a 
corresponding paradigm shift occurring in philoso- 
phy and cognitive science, is to examine the end 
goals of usability, transparency, and usefulness in 
HCI and to understand that the processes of reach- 
ing our goals have as much import to our ways of 
living and experience as the goals themselves. In 
recognition of the artwork as both expressive and 
impressive, and in view of HCI’s twofold interest in
development processes and trajectories of use, the 
following are suggested for consideration toward a 
complementary methodology for human-computer 
interaction: 

• That the creative process be far less user- 
centered than we have been used to, placing 
trust in the role of the artist. 

• That this creative process be studied through 
phenomenologically informed methods. 

• That computational elements be approached as 
material. 

• That the work is seen as open for subjective 
reading, while retaining the voice of its author 
or the mark of its maker. 

• That a body of theory and criticism be built to 
support the meaning of this type of work, much 
in the way discourse is created in the art world. 

• That trajectories of consumption or use be 
described through phenomenologically informed 
methods. 

Tools help to do things, but art helps us see why 
we do them. As a basis for a complementary meth- 
odology, art offers an understanding of ourselves in 
action in a way that instrumentality does not. 

KEY TERMS

Aesthetics of Use: The behavioral aspects of 
an artifact, system, or device that precipitate an 
aesthetic experience based on temporal as well as 
spatial elements. 

Artwork: An actively created physical instance 
of a narrative. Created by a working through of 
sensory perception and presented as an invitation to 
social discourse. 

Author: Much of the philosophy extended in 
analyses of technology stems from literary theory. 
Author has connotations synonymous with artist 
and designer and may be useful in future discus- 
sions regarding terminology across more or less 
user-centered processes. 



Genotype: Anthony Dunne’s alternative to the 
prototype — a non-working yet complete product 
specifically aimed at provoking fictive, social, and 
aesthetic considerations in an audience. 

Materiality: The way in which the manner of 
material realization of an idea on the part of the artist 
or designer implicates subsequent experience of it. 

Phenomenology: A strand of philosophy that 
accounts for human action without mental represen- 
tation. Martin Heidegger (1889-1976), currently 
demanding rereading in the HCI community, is one 
of the most important thinkers in this field. 

Trajectories of Use: No designer, artist, or 
developer can predict the outcomes of the many and 
consequential readings or uses to which a public will 
put his work. The work is a starting point of social 
interaction, not an end in itself. 

Transparency: Also known as disappearance, 
transparency is largely considered the hallmark of 
good interaction design, wherein the user is able to 
complete tasks without cognitive interference caused 
by the interface. The user is said to act through the 
computer rather than to interact with it. 

User: The normative term used in the develop- 
ment of information devices, tools, and systems. Has 
been criticized for its impersonal, non-performative 
implications. Alternative suggestions include actor, 
audience, reader, and observer, depending on 
context. No satisfactory term has been agreed upon 
that might take into account all of these contexts. 

INTRODUCTION

Much information science research has focused on 
the design of systems enabling users to access, 
communicate, and use information quickly and efficiently.
However, the users’ ability to exploit this
information is seriously limited by finite human cog- 
nitive resources. In cognitive psychology, the role of 
attentional processes in allocating cognitive resources 
has been demonstrated to be crucial. Attention is 
often defined as the set of processes guiding the 
selection of the environmental stimuli to be attended. 
Access to information therefore is not only regulated 
by its availability but also by the users’ choice to 
attend the information — this choice being governed 
by attentional processes. Recently several research- 
ers and practitioners in Human Computer Interac- 
tion (HCI) have concentrated on the design of 
systems capable of adapting to, and supporting, 
human attentional processes. These systems, which
often rely on very different technologies and theories
and are designed for a range of applications,
are called attention-aware systems (AAS).
In the literature, these systems have also been 
referred to as Attentive User Interfaces (Vertegaal, 
2003). However, we prefer using the former name 
as it stresses the fact that issues related to attention 
are relevant to the design of the system as a whole 
rather than limited to the interface. The recent 
interest in this field is testified by the publication of 
special issues in academic journals (e.g., Communications
of the ACM, 46(3), 2003; International
Journal of Human-Computer Studies, 58(5), 2003) 
and by the organisation of specialised fora of discus- 
sion (e.g., the workshop on “Designing for Atten- 
tion”; Roda & Thomas, 2004). 

In this article, we discuss the rationale for AASs 
and their role within current HCI research, we
briefly review current research in AASs, and we
highlight some open questions for their design. 

BACKGROUND: RATIONALE FOR 
AND ROLE OF ATTENTION-AWARE 
SYSTEMS 

In this section, we analyze the rationale for AASs 
and we discuss their role in HCI research. 

Why Attention-Aware Systems? 

Information overload is one of the most often men- 
tioned problems of working, studying, playing, and 
generally living in a networked society. One of the 
consequences of information overload is the fast 
shift of attention from one subject to another or one 
activity to another. In certain situations, the ability to 
quickly access several information sources, to switch 
activities, or to change context is advantageous. In 
other situations, it would be more fruitful to create 
and maintain a focus while offering the possibility to 
switch attention to other contents or activities only 
as a background low-noise open choice. System 
awareness about the cost/benefits of attentional 
shifts with respect to the users’ goals is essential in 
environments where (1) attentional switches are 
very often solicited, or (2) where the users’ lack of 
experience with the environment makes it harder for 
them to select the appropriate attentional focus, or 
(3) where an inappropriate selection of attentional 
focus may cause serious damage to the system, its 
users, or third parties. Systems relying highly on 
multi-user interaction, such as virtual communities 
and certain systems supporting cooperative work, 
are examples of environments where attentional 
switches are often solicited. Online educational systems
are examples of environments where the lack
of knowledge and experience of users with the 
subject at hand makes it harder for them to select the 
appropriate attentional focus and may easily cause a 
loss of focus. Life critical systems are examples of 
environments where an inappropriate selection of 
attentional focus may cause serious damage. The 
need for AASs is quite widespread, especially if one
considers that assessing, supporting, and maintain- 
ing users’ attention may be desirable in other envi- 
ronments such as entertainment and e-commerce. 

Attention-Aware Systems in 
HCI Research 

A large portion of research on human attention in 
digital environments is based on the findings of 
cognitive psychology. For example, Raskin (2000) 
analyses how single locus of attention and habit 
formation have important consequences on human 
ability to interact with computers. He proposes that
habit creation is a mechanism that can be used to 
shift the focus of users from the interface to the 
specific target task. 

This study follows the classic “direct manipula- 
tion” school (Shneiderman, 1992, 1997) which aims 
at supporting the attentional choices of the user by 
making the device “transparent” so that the user can 
focus on the task rather than on the interface. The 
wide range of systems designed with this aim is often 
referred to as transparent systems, a term also 
employed in ubiquitous computing (Abowd, 1999; 
Weiser, 1991). 

Another area of research focuses instead on 
designing interfaces and systems capable of guiding 
the users in the choice of attentional focus. The 
system is seen as proactive, visible, and capable of 
supporting the users in their choices. These types of 
systems are often designed as artificial agents 
(Bradshaw, 1997; Huhns & Singh, 1997) acting as 
proactive helpers for the user (Maes, 1994; 
Negroponte, 1997), and they are frequently referred 
to as proactive/adaptive systems. 

The two approaches are often regarded as diver- 
gent: (1) responding to different needs and (2) 
requiring different design choices. However, this is
not necessarily the case, as should become apparent
from the following discussion of these two
alleged differences in users’ needs and design
choices. Concerning the ability to respond to user
needs, consider, for example, one of the metaphors
most often used for proactive systems: Negroponte’s
English butler (Negroponte, 1995). “The best meta- 
phor I can conceive of for a human-computer inter- 
face is that of a well-trained English butler. The 
‘agent’ answers the phone, recognizes the callers, 
disturbs you when appropriate, and may even tell a 
white lie on your behalf. The same agent is well 
trained in timing, versed in finding the opportune 
moments, and respectful of idiosyncrasies. People 
who know the butler enjoy considerable advantage 
over a total stranger. That is just fine” (p. 150). Isn’t 
this proactive/adaptive system an exquisite example 
of a transparent system? The English butler certainly
knows when to disappear, but he
is there when required and is capable of proactive 
behavior such as selecting the calls you may want to 
receive or even telling a joke if appropriate! Con- 
cerning the design choices, a few considerations 
should be made. First of all, any system needs to be 
proactive in certain situations (e.g., reporting errors)
and transparent in others. Secondly, certain applica- 
tions, in particular those where the user has a good 
knowledge of the most effective attentional focus, 
require mostly transparent interfaces, while certain 
others, where the user is more in need of guidance, 
require more proactive interfaces. Also, the users’
needs, the system’s functionality, and the use that is
made of the system may change with time. Therefore,
it may be desirable for a system that is initially
very proactive to slowly become transparent, or
vice versa. Finally, applications exist where the user
is expected to focus on the system/interface itself,
such as digital art. As a consequence, just as
proactive/adaptive behaviors may not always be desirable,
transparency itself may, under certain conditions, 
not be desirable. 

This brings us to another reason for studies 
related to AASs. In the last two decades, there has 
been a shift in the use and market of Information
and Communication Technologies (ICT) from strictly 
task oriented (work related) to more of a pervasive 
personal and social use of these technologies. Per- 
forming a task or achieving a goal may not be the 
main target of the user who instead may turn to ICT 
artifacts for their symbolic or affective value, enter- 
tainment, or pleasure in general; see, for example, 
Lowgren’s arguments for Interactive Design versus
classic HCI in Lowgren (2002). Capturing and maintaining
user attention may then actually be the ultimate
goal of the system.

The real challenge of modern interface design is 
therefore at the meta-level. We should not aim at 
designing transparent or proactive systems. Rather 
we should aim at designing systems capable of rea- 
soning about users’ attention, and consequently de- 
cide how best to disappear or to gain and guide users’ 
attention. Focusing on attentional mechanisms also 
provides a framework that reconciles the direct 
manipulation user interfaces approach and the inter- 
face agents approach as clearly presented and exem- 
plified by Horvitz (1999). 

HUMAN ATTENTION AND SYSTEMS 
CAPABLE OF SUPPORTING IT 

This section briefly reviews the work done so far in 
AASs; for a more extensive review, see Roda and 
Thomas (2006). It should be noted that attention has 
not often been prioritised as a specific subject of 
research in HCI (with some notable exceptions in- 
cluding the Attentional User Interface project at 
Microsoft research [Horvitz, Kadie, Paek, & Hovel, 
2003]). As a consequence, much of the work rel- 
evant to the development of AASs appears in the 
context of other research frames. This is especially 
the case as attention processes are related to, and 
necessary for, the successful accomplishment of 
many diverse activities. 

Human attention has been widely researched in 
cognitive psychology and, more recently, in neu- 
ropsychology. Although there is no common agree- 
ment on a definition of “attention”, attention is gener- 
ally understood as the set of processes allowing 
humans to cope with the, otherwise overwhelming, 
stimuli in the environment. Attention therefore refers 
to the set of processes by which we select informa- 
tion (Driver, 2001; Uttal, 2000). These processes are 
mainly of two types: endogenous (i.e., guided by 
volition) and exogenous (i.e., guided by reaction to 
external stimuli). Given this view of attention as a 
selection of external stimuli, it is obvious that atten- 
tion is somehow related to human sensory mecha- 
nisms. Visual attention, for example, has been widely 
studied in cognitive psychology, and it is particularly 
relevant to HCI since the current predominant modality
for computer-to-human communication is
screen display. Using the results of psychological 
studies in visual attention, some authors have pro- 
posed visual techniques for notification displays 
that aim at easy detection while minimising distrac- 
tion (Bartram, Ware, & Calvert, 2003). Attention 
in modalities other than visual, as well as attention
across modalities, has not been investigated to the
same extent as visual attention. However, Bearne 
and his colleagues (Bearne, Jones, & Sapsford- 
Francis, 1994) propose guidelines for the design of 
multimedia systems grounded in attentional mecha- 
nisms. 

Systems capable of supporting and guiding user 
attention must, in general, be able to: (1) assess the 
current user focus, and (2) make predictions on the 
cost/benefits of attention shifts (interruptions). We 
conclude this section with a review of the work 
done so far in these two directions. 

Several sensory-based mechanisms for the de- 
tection of users’ attention have been employed, 
including gaze tracking (Hyrskykari, Majaranta, 
Aaltonen, & Raiha, 2000; Vertegaal, 1999; Zhai, 
2003), gesture tracking (Hinckley, Pierce, Sinclair, 
& Horvitz, 2000), head pose and acoustic tracking 
(Stiefelhagen, 2002). Horvitz and his colleagues 
(2003) propose that sensory-based mechanisms 
could be integrated with other cues about users’
current focus. These cues could be extracted
from users’ scheduled activities (e.g., using online 
calendars), users’ interaction with software and 
devices, and information about the users and their 
patterns of activity and attention. In any case, even 
when employing mechanisms capable of taking into 
account all these cues, a certain level of uncertainty 
about users’ focus, activities, goals, and best future 
actions will always remain and will have to be dealt 
with within the system (Horvitz et al., 2003). 

The problem of evaluating the cost/benefit of 
interruptions has been researched mostly in the 
context of notification systems (Brush, Bargeron, 
Gupta, & Grudin, 2001; Carroll, Neale, Isenhour, 
Rosson, & McCrickard, 2003; Czerwinski, Cutrell, 
& Horvitz, 2000; Hudson et al., 2003; McCrickard, 
Chewar, Somervell, & Ndiwalana, 2003b; 
McCrickard, Czerwinski, & Bartram, 2003c). This 
research aims at defining the factors determining 
the likely utility of a given piece of information, for a given
user, in a given context, and the costs associated
with presenting the information in a certain manner,
to the user, in that context. McCrickard and Chewar
(2003) integrate much of the research in this direction
and propose an attention-utility trade-off model.
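
The cost/benefit reasoning described above can be illustrated with a small expected-utility sketch. This is not the model of McCrickard and Chewar (2003), nor that of Horvitz et al. (2003); the state names, probabilities, costs, and the `should_notify` rule are hypothetical and only show how uncertainty about the user's focus might enter a decision to interrupt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FocusState:
    """A hypothetical attentional state and the system's belief in it."""
    name: str
    probability: float        # belief that the user is currently in this state
    interruption_cost: float  # assumed cost of interrupting in this state


def expected_interruption_cost(states):
    """Probability-weighted cost of interrupting under uncertain user focus."""
    return sum(s.probability * s.interruption_cost for s in states)


def should_notify(information_value, states):
    """Notify only if the assumed value of the information exceeds the expected cost."""
    return information_value > expected_interruption_cost(states)


# Illustrative belief about the user's current focus (all values are made up).
belief = [
    FocusState("deep_focus", 0.6, 10.0),
    FocusState("light_task", 0.3, 3.0),
    FocusState("idle", 0.1, 0.5),
]

cost = expected_interruption_cost(belief)
urgent = should_notify(8.0, belief)    # high-value information
trivial = should_notify(5.0, belief)   # low-value information
```

With these assumed numbers, the high-value item is delivered and the low-value one is withheld; in a real system both the beliefs and the utilities would have to be estimated from the cues discussed above.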

FUTURE TRENDS 

AASs will be crucial for the development of applica- 
tions in a wide variety of domains including educa- 
tion, life critical systems (e.g., air traffic control), 
support for monitoring and diagnosis, knowledge management,
simulation of human-like characters, games,
and e-commerce. In order to unleash the full
potential of these systems, however, there are many
fundamental aspects of attention, of the mechanisms
that humans use to manage it, and of their application
in digital environments that require further exploration.
As will become clear from the discussion
below, this exploration would greatly benefit from a
more interdisciplinary approach to the design of 
AASs. First, although a very significant amount of 
research on human attention has been undertaken in 
psychology, several HCI researchers agree that the
reported theories are often too far removed from the 
specific issues relevant to human computer interac- 
tion to be easily applied to this field of research 
(McCrickard et al., 2003c) and that more focused 
research in this direction is needed (Horvitz et al., 
2003). 

A second important issue in the design of AASs 
is the definition of parameters against which one 
could measure their efficiency. In their work on 
notification systems, McCrickard and his colleagues 
(McCrickard, Catrambone, Chewar, & Stasko, 
2003a) advance a proposal in this direction; how- 
ever, further discussion is needed in order to achieve 
an agreement on parameters that are generally 
accepted. 

Third, although the visual modality has been 
extensively researched in cognitive psychology and 
HCI, this work is mostly focused on still images.
How would the principles apply to moving images? 

Fourth, much work remains to be done on modali- 
ties other than visual. In particular, research on 
attention in speech (from phonetics to semantics and 
rhetoric) (Argyle & Cook, 1976; Clark, 1996; Grosz 
& Sidner, 1990) could be fruitfully applied to HCI 
research in AASs. Distribution of attention over
several modalities is a field that also deserves further
research.

Fifth, most of the work on the evaluation of the 
cost/benefits of interruptions has been done taking 
the point of view of the user being interrupted; such 
analysis, however, should also take into account the 
cost/benefit to the interrupter, and the joint cost/ 
benefit (Hudson, Christensen, Kellogg, & Erickson, 
2002; O’Conaill & Frohlich, 1995). 

Sixth, certain aspects of human attention related 
to social and aesthetic processes have been largely 
disregarded in current research. How could these 
processes be taken into consideration? Furthermore, 
most of the target applications in AASs assume that 
the user is in a “work”/task-oriented situation. How
would AAS design apply to different situations (play, 
entertainment)? 

CONCLUSION 

AASs are systems capable of reasoning about user 
attention. In a task-oriented environment, such sys- 
tems address the problem of information overload by 
striving to select and present information in a manner 
that optimizes the cost/benefit associated with us- 
ers’ shifts of attentional focus between contexts and 
tasks. In this article, we have reviewed the work 
done so far in this direction. We have also indicated 
some issues related to the future development of 
AASs. Among these, the most significant ones are 
the need to further investigate the application of 
AASs in environments that are not task-oriented, 
and the need to take into account collaborative 
situations when evaluating the cost/benefit of 
attentional shifts.

KEY TERMS

Direct Manipulation User Interfaces: Inter- 
faces that aim at making objects and actions in the 
systems visible by [graphical] representation. They 
were originally proposed as an alternative to com- 
mand line interfaces. The system’s objects and 
actions are often represented by metaphorical icons 
on screen (e.g., dragging a file to the recycle bin for 
deleting a file). Designers of direct manipulation
user interfaces strive to provide incremental, reversible
operations and visible effects.

Endogenous Attentional Processes: Refers 
to the set of processes of voluntary (conscious) 
control of attention. These processes are also referred
to as top-down or goal-driven. An example of an
endogenous attentional mechanism is the attention
you are paying to this page as you are reading.
Endogenous attention is voluntary; it requires ex- 
plicit effort, and it is normally meant to last. 

Exogenous Attentional Processes: Refers to 
the set of processes by which attention is captured 
by some external event. These processes are also 
referred to as bottom-up or stimulus-driven. An 
example of this mechanism would be the attention 
shift from your reading due to a sudden noise. 
Exogenous attention is triggered automatically, and
it normally lasts a short time before it is either shifted
or becomes controlled by endogenous processes.

Gaze Tracking: The set of mechanisms that allow
human eye-gaze to be recorded and analysed. Gaze
tracking is normally motivated by the assumption
that the locus of eye-gaze may, to some extent,
correspond to the locus of attention, or that it can help
capture user interests. Several techniques exist for
eye tracking, varying in their level of intrusion (from
requiring the user to wear special lenses to just
having camera-like devices installed on the computer),
their accuracy, and their ease of use. Normally,
devices need to be calibrated before use (some
systems allow calibrations to be memorised for specific
users).

Gesture Tracking: The set of mechanisms that
allow human motion to be recorded and analysed. Gestures
may be tracked in either 2D or 3D. Gesture
tracking ranges from the recording and analysis of
postures (e.g., head, body) to that of more detailed
elements such as fine hand movement or facial
expression. The aims of gesture tracking in HCI
range from recognising the user’s current activity (or
lack thereof) to recognising emotional states. Gesture
tracking is often used in combination with gaze
tracking.

Locus of Attention: Among all sensory input, 
the locus of attention is the input to which one 
allocates mental resources. Input that falls outside 
the locus of attention may go absolutely unnoticed. 
An example of locus of attention is a specific section 
of a computer screen. 

Visual Attention: The process by which we 
select the visual information most relevant to our 
current behaviour. In general, of all the visual stimuli 
we receive, we only attend to a few; this determines
what we “see.” Visual attention controls the selec- 
tion of appropriate visual stimuli both by pruning 
irrelevant ones and by guiding the seeking of rel- 
evant ones. Research in visual attention aims at 
understanding the mechanisms by which human 
sensory and cognitive systems regulate what we 
see.

INTRODUCTION

Building systems that are correct by design has 
always been a major challenge of software develop- 
ment. Typical software development approaches 
(and in particular interactive systems development 
approaches) are based around the notion of 
prototyping and testing. However, except for simple 
systems, testing cannot guarantee absence of er- 
rors, and, in the case of interactive systems, testing 
with real users can become extremely resource 
intensive and time-consuming. Additionally, when a 
system reaches a prototype stage that is amenable to 
testing, many design decisions have already been 
made and committed to. In fact, in an industrial 
setting, user testing can become useless if it is done 
when time or money is no longer available to sub- 
stantially change the design. 

To address these issues, a number of discount 
techniques for usability evaluation of early designs 
were proposed. Two examples are heuristic evalu- 
ation, and cognitive walkthroughs. Although their 
effectiveness has been the subject of debate, reports
show that they are being used in practice. These 
are largely informal approaches that do not scale 
well as the complexity of the systems (or the 
complexity of the interaction between system and 
users) increases. In recent years, researchers have 
started investigating the applicability of automated 
reasoning techniques and tools to the analysis of 
interactive systems models. The hope is that
these tools will enable more thorough analysis of 
the designs. 

The challenge faced is how to fold human factors
issues into a formal setting such as that created by
the use of such tools. This article reviews some of 
the work in this area and presents some directions 
for future work. 



BACKGROUND 

As stated earlier, discount usability analysis methods 
have been proposed as a means to achieve some 
degree of confidence in the design of a system from 
as early as possible in development. Nielsen and 
Molich (1990) proposed a usability inspection method 
based on the assumption that there are a number of 
general characteristics that all usable systems should 
exhibit. The method (heuristic evaluation) involves 
systematic inspection of the design by means of 
guidelines for good practice. Applying heuristic evalu- 
ation involves setting up a team of evaluators to 
analyze the design of the user interface. Once all 
evaluators have performed their analysis, results are 
aggregated thus providing a more comprehensive 
analysis of the design. To guide analysis, a set of 
design heuristics is used based on general purpose 
design guidelines. Over the years, different sets of 
heuristics have been proposed for different types of 
systems. The set proposed by Nielsen (1993) comprises
nine heuristics: simple and natural dialog,
speak the user’s language, minimize user memory
load, be consistent, provide feedback, provide
clearly-marked exits, provide short cuts, good
error messages, and prevent errors.

Usability inspection provides little indication of 
how the analyst should check whether the system 
satisfies a guideline. Cognitive walkthrough (Lewis, 
Polson, Wharton, & Rieman, 1990) is one technique
that provides better guidance to the analyst. Its aim 
is to analyze how well the interface will guide the 
user in performing tasks. User tasks must first be 
identified, and a model of the interface must be built 
that covers all possible courses of action the user 
might take. Analysis of how a user would execute 
the task is performed by asking three questions at 
each stage of the interaction: “Will the correct
action be made sufficiently evident to users?”,
“Will users connect the correct action’s description
with what they are trying to achieve?”, and
“Will users interpret the system’s response to the
chosen action correctly?” Problems are identified
whenever there is a “no” answer to one of these 
questions. 

Informal analytic approaches such as those de- 
scribed pose problems for engineers of complex 
interactive systems. For complex devices, heuristics 
such as “prevent errors” can become too difficult to 
apply and validate. Cognitive walkthroughs provide 
more structure but will become extremely resource 
intensive as systems increase in complexity and the 
set of possible user actions grows. 

To address these issues, researchers have started 
looking into the application of automated reasoning 
techniques to models of interactive systems. These 
techniques are generally more limited in their appli- 
cation. This happens both because of the cost of 
producing detailed initial models and because each 
tool performs a specific type of reasoning only. 
Nevertheless, they have the potential advantage that 
they can provide a precise description that can be 
used as a basis for systematic mechanical analysis in 
a way that would not otherwise be possible. 

Automated theorem proving is a deductive ap- 
proach to the verification of systems. Available 
theorem provers range from fully interactive tools to 
provers that, given a proof, check if the proof is 
correct with no further interaction from the user. 
While some systems provide only a basic set of 
methods for manipulating the logic, giving the user 
full control over the proof strategy, others include 
complex tactics and strategies, meaning the user 
might not know exactly what has been done in each 
step. Due to this mechanical nature, we can trust a 
proof done in a theorem prover to be correct, as 
opposed to the recognized error prone manual pro- 
cess. While this is an advantage, it also means that 
doing a proof in a theorem prover can be more 
difficult, as every little bit must be proved. 

Model checking was proposed as an alternative 
to the use of theorem provers in concurrent program 
verification (Clarke, Emerson, & Sistla, 1986). The 
basic premise of model checking was that a finite 
state machine specification of a system can be 
subject to exhaustive analysis of its entire state 
space to determine what properties hold of the 
system’s behavior. By using an algorithm to perform
exhaustive state space analysis, the analysis becomes
fully automated. A main drawback of model
checking has to do with the size of the finite state 
machine needed to specify a given system: useful 
specifications may generate state spaces so large 
that it becomes impractical to analyze the entire 
state space. The use of symbolic model checking 
somewhat diminishes this problem. Avoiding the 
explicit representation of states and exploiting state 
space structural regularity enable the analysis of 
state spaces that might be as big as 10^20 states
(Burch, Clarke, & McMillan, 1990). The technique 
has been very successful in the analysis of hardware 
and communication protocol designs. In recent
years, its applicability to software in general has also 
become a subject of interest. 
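
The idea of exhaustive state space analysis can be made concrete with a minimal explicit-state sketch. It enumerates states one by one, so it illustrates the basic premise rather than the symbolic technique of Burch et al.; the device (a volume control with a mute flag), the `explore` and `holds_everywhere` helpers, and the checked properties are all hypothetical.

```python
from collections import deque

MAX_VOLUME = 3  # assumed upper bound of the toy device's volume


def successors(state):
    """All states reachable in one user action from (volume, muted)."""
    volume, muted = state
    return {
        (min(volume + 1, MAX_VOLUME), muted),  # volume up
        (max(volume - 1, 0), muted),           # volume down
        (volume, not muted),                   # toggle mute
    }


def explore(initial, successors):
    """Breadth-first enumeration of every state reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def holds_everywhere(initial, successors, prop):
    """Check that `prop` is true in every reachable state."""
    return all(prop(s) for s in explore(initial, successors))


states = explore((0, False), successors)
volume_in_range = holds_everywhere(
    (0, False), successors, lambda s: 0 <= s[0] <= MAX_VOLUME
)
never_muted = holds_everywhere((0, False), successors, lambda s: not s[1])
```

Even in this toy model, each added state variable multiplies the number of states to visit, which is the state explosion problem that motivates the symbolic representation.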

AUTOMATED REASONING FOR 
USABILITY EVALUATION 

Ensuring the quality (usability) of interactive sys- 
tems’ designs is a particularly difficult task. This is 
mainly due to the need to consider the human side of 
the interaction process. As the complexity of the 
interaction between users and devices increases, so 
does the need to guarantee the quality of such 
interaction. This has led researchers to investigate 
the applicability of automated reasoning tools to 
interactive systems development. 

Abowd, Wang, and Monk (1995) showed
how models of interactive systems could be trans- 
lated into SMV (Symbolic Model Verifier) models 
for verification. SMV (McMillan, 1993) is a sym- 
bolic model checker, at the time being developed at 
Carnegie Mellon University, USA (CMU). They 
specified the user interface in a propositional pro- 
duction systems style using the action simulator tool 
(Curry & Monk, 1995). The specification was then 
analyzed in SMV using computation tree logic
(CTL) formulae. The authors proposed a number of
templates for the verification of usability related
properties. The questions that are proposed are of
the type: “Can a rule somehow be enabled?”, “Is
it true that the dialogue is deadlock free?”, or
“Can the user find a way to accomplish a task
from initialization?”.
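
The three kinds of question listed above can be approximated on an explicit transition graph, as in the following sketch. It does not reproduce the SMV templates of Abowd et al. (1995); the dialogue model, the function names, and the reading of each question as a reachability-style check (roughly "EF" for the first and third, "AG EX true" for deadlock freedom) are assumptions made for illustration.

```python
# Hypothetical dialogue model: state -> {user action: next state}.
DIALOGUE = {
    "start":   {"open_menu": "menu"},
    "menu":    {"choose_print": "confirm", "cancel": "start"},
    "confirm": {"ok": "printed", "cancel": "menu"},
    "printed": {"done": "start"},
}


def reachable(model, initial):
    """Set of states reachable from `initial` by any sequence of actions."""
    seen, stack = {initial}, [initial]
    while stack:
        state = stack.pop()
        for nxt in model.get(state, {}).values():
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def rule_can_be_enabled(model, initial, action):
    """Is there a reachable state in which `action` is offered?"""
    return any(action in model.get(s, {}) for s in reachable(model, initial))


def deadlock_free(model, initial):
    """Does every reachable state offer at least one action?"""
    return all(model.get(s) for s in reachable(model, initial))


def task_achievable(model, initial, goal):
    """Can the user reach `goal` starting from `initial`?"""
    return goal in reachable(model, initial)


# A variant in which the dialogue gets stuck after printing.
BROKEN = dict(DIALOGUE, printed={})
```

In the assumed model all three checks succeed from `"start"`, while the `BROKEN` variant fails the deadlock check because `"printed"` offers no further action.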

The modeling approach was quite naive and 
enabled the expression of models at a very high level
of abstraction only. Roughly at the same time, Paterno 
(1995), in his D.Phil thesis, proposed an approach 
based on the LOTOS specification language (Language
of Temporal Ordering Specification). Device
models were derived from the task analysis and 
translated into the Lite tool (LOTOS Integrated Tool 
Environment). The models that could be verified with 
this approach were far more elaborate than with 
Abowd et al.’s (1995), but the translation process 
posed a number of technical difficulties. The lan- 
guage used to express the Lite models (Basic LOTOS) 
was less expressive than the language used for the 
modeling of the system (LOTOS interactors). Nev- 
ertheless, a number of property templates were 
proposed for checking the specification. These were 
divided into interactor, system integrity, and user 
interface properties. Regarding user interface prop- 
erties, templates fell into three broad classes: 
reachability, visibility, and task related. Reachability 
was defined as: “... given a user action, it is possible 
to reach an effect which is described by a specific 
action.” (Paterno, 1995, p. 103). All properties are
expressed in terms of actions. This was due to there 
being no notion of a system state in the models and 
logic used. 

The main difference between the two approaches
comes exactly from the specification notations and 
logics used. Abowd et al. (1995) adopted a simple 
and easy to use approach. The approach might be too 
simple, however. In fact, for the verification to be 
useful, it must be done at an appropriate level of 
detail, whereas action simulator was designed for 
very high level abstract specifications. Paterno (1995) 
avoids this problem by using a more powerful speci- 
fication notation. This, however, created problems 
when specifications needed to be translated into the 
model checker’s input language (which was less 
expressive). 

In the following years, a number of different
approaches were proposed, using not only model
checking but also theorem proving. Most of this work
was reported in the DSV-IS series of workshops.
d’Ausbourg, Durrieu, and Roche (1996) used the 
data flow language Lustre. Models were derived 
from UIL descriptions and expressed in Lustre. 
Verification is achieved by augmenting the interface 
with Lustre nodes modeling the intended properties 
and using the tool Lesar to traverse the automaton 
generated from this new system. The use of the same
language to model both the system and its properties
seems to solve some of the translation problems
in Paterno’s approach, but the language was
limited in terms of the data types available. 

Bumbulis, Alencar, Cowan, and Lucena (1996) 
showed how they were using HOL (a Higher Order 
Logic theorem prover) in the verification of user 
interface specifications. They specified user inter- 
faces as sets of connected interface components. 
These specifications could then be implemented in 
some toolkit as well as modeled in the higher order 
logic of the HOL system for formal verification. An 
immediately obvious advantage of this approach is 
that the formalism used to perform the analysis, 
Higher Order Logic, was at the same level of
expressiveness as the formalism used to write the
specification. So, again, the translation problems of
Paterno’s approach could be avoided. The logic
used, however, could not easily capture temporal 
properties. What was specified was not so much 
the interaction between the users and the interface, 
but the interface architecture and how the different 
components communicate with each other. Al- 
though the approach used a powerful verification 
environment, it had two main drawbacks: the specification
style and the logic used did not allow
reasoning about some of the important aspects of
interaction, and the verification process was quite
complex.

Dwyer, Carr, and Hines (1997) explored the 
application of abstraction to reverse engineer toolkit- 
based user interface code. The generated models 
were then analyzed in SMV. This is a different type 
of approach since it does not rely on developers 
building models for verification. Instead, models are 
derived from the code. 

Doherty, Massink, and Faconti (2001) applied 
HyTech, a tool for reachability analysis in hybrid 
automata, to the analysis of the flight deck instru- 
mentation concerning the hydraulics subsystem of 
an aircraft. The use of hybrid automata enabled the 
analysis of continuous aspects of the system. 

One of the characteristics of model checking is 
that all possible interactions between user and 
device will be considered during the verification 
step. While this enables a more thorough analysis of 
the design, in many situations only specific user 
behaviors will be of interest. To address this, Doherty 
et al. (2001) propose that a model of the user be
explicitly built. However, the user model used was
very simplistic: it corresponded simply to all the 
actions that can be performed by the user. 
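
The effect of adding a user model can be sketched as a joint exploration in which an action is taken only if both the device and the user model allow it. This is not the hybrid-automata model analyzed with HyTech by Doherty et al. (2001); the device, the user model, and the `joint_reachable` function are hypothetical, and the user model here is deliberately more restrictive than the "all actions" model described above.

```python
# Hypothetical device model: state -> {action: next state}.
DEVICE = {
    "off":     {"power": "standby"},
    "standby": {"power": "off", "play": "playing"},
    "playing": {"stop": "standby", "pause": "paused", "power": "off"},
    "paused":  {"play": "playing", "stop": "standby", "power": "off"},
}

# Hypothetical user model: this user powers on, plays, and stops,
# but never pauses and never powers off.
USER = {
    "wants_music": {"power": "ready"},
    "ready":       {"play": "listening"},
    "listening":   {"stop": "ready"},
}


def joint_reachable(device, user, device_start, user_start):
    """Explore (device state, user state) pairs using only shared actions."""
    start = (device_start, user_start)
    seen, stack = {start}, [start]
    while stack:
        d, u = stack.pop()
        # Only actions enabled in the device AND foreseen by the user model.
        for action in device[d].keys() & user[u].keys():
            nxt = (device[d][action], user[u][action])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


joint = joint_reachable(DEVICE, USER, "off", "wants_music")
device_states_visited = {d for d, _ in joint}
```

Under this assumed user model the `"paused"` device state is never visited, so properties are verified only over the behaviors the user model permits.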

Rushby (2002) also used a model of the user in his 
work. In this case, the user model was built into a 
previously-developed model of the system, and it 
defined the specific sequences of actions the user is 
expected to carry out. The analysis was performed 
in Murφ (the Murφ verification system), a state
exploration tool developed at Stanford University, 
USA (Dill, 1996), and the author used it to reason 
about automation surprise in the context of an air- 
craft cockpit. Also in the context of the analysis of 
mode confusion in digital flight decks, there has been 
work carried out at NASA Langley Research Center
(Lüttgen & Carreno, 1999). The models used
related to the inner working of the system’s mode 
logic, while the goal of the other approaches men- 
tioned herein is to build the models of the user 
interface. While this latter view might be criticized 
from the point of view that not all application logic is 
presented at the interface, it allows better explora- 
tion of the interaction between the user and the 
system, and not simply of how the system reacts to 
commands. 

Campos and Harrison (Campos, 1999; Campos 
& Harrison, 2001) used SMV, but models were 
expressed in Modal Action Logic and structured 
around the notion of interactor (Duke & Harrison, 
1993). This enabled richer models where both state 
information and actions were present. Originally, 
only device models were built for verification. When- 
ever specific user behaviors needed to be discarded 
from the analysis, this was done in the properties to 
be verified instead of in the model. More recently, 
Campos has shown how it is possible to encode task 
information in the model so that only behaviors that 
comply with the defined tasks for the system are 
considered (Campos, 2003). 

A different style of approach is proposed in 
Blandford and Good (1998) and Thimbleby (2004). 
In the former, the modeling process is centered around
a cognitive architecture (Programmable User Mod- 
els) that is supposed to simulate a user. This archi- 
tecture is programmed with knowledge about the 
device, and it is then run together with a device 
model. Observation of the joint behavior of both 
models is performed in order to identify possible 
errors. 



The approach is based neither on model checking nor
on theorem proving, but rather on simulation. Hence, it
cannot provide the thoroughness of analysis of the
other approaches. The main drawback, however, is
the cost of programming the user model. According
to the authors, it is seldom cost-effective to develop
a full model. Instead, they argue that the formalization
process alone gives enough insight into the
design without necessarily having to build a running 
model. 

Thimbleby (2004) uses matrices to model user 
interfaces, and matrix algebra to reason about us- 
ability related properties of the models. Instead of 
using model checking to reason about the properties 
of finite state machines representing the user inter- 
faces, matrices are used to represent the transition 
relation of those finite state machines. The author 
reverse engineers the user interface of three handheld 
devices, and shows how they can be analyzed using 
matrix algebra. The author argues that the approach 
is simpler and requires less expertise than working 
with model checking or theorem proving tools. 
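The idea of representing the transition relation as matrices can be illustrated with a minimal sketch. The two-state device and its two buttons are hypothetical and are not one of the handheld devices analyzed by the author. Each button is a 0/1 matrix in which row i has a 1 in the column of the state reached from state i, so that pressing buttons in sequence corresponds to multiplying their matrices.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical device with two states: 0 = off, 1 = on.
IDENTITY = [[1, 0], [0, 1]]
POWER = [[0, 1], [1, 0]]  # toggles between off and on
OFF = [[1, 0], [1, 0]]    # leads to off from every state


def press(*buttons):
    """Return the matrix describing a sequence of button presses."""
    result = IDENTITY
    for button in buttons:
        result = mat_mul(result, button)
    return result
```

Usability-related properties then become matrix equations. For example, `press(POWER, POWER) == IDENTITY` states that pressing the power button twice always returns the device to the state it started in, and `press(POWER, OFF) == OFF` states that the off button has the same effect whatever was pressed before it.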

Finally, in all of the previously mentioned approaches,
the main emphasis is on the applicability of
the model checking technology to the task of analyzing
properties of models of interactive systems. Little
or no effort is devoted to making the approaches
usable for a wider audience. Loer and Harrison have
moved in that direction with IFADIS (Loer, 2003), a
tool for the analysis of user interface models. IFADIS
uses OFAN Statecharts as the modeling notation,
and Statemate as the tool to edit the models. Models
are then translated for verification in SMV. The
tool provides an environment for modeling, definition
of properties, and analysis of the results of the
verification process.

FUTURE TRENDS 

With the exception of the IFADIS tool, work on the 
application of automated reasoning tools to the veri- 
fication of interactive systems has, so far, attempted 
mainly to prove the technical viability of the ap- 
proaches. A large number of design notations and 
tools for model checking have been proposed. De- 
veloping an understanding of usability and verifica- 
tion technology to go with it is not enough to guaran- 
tee a useful and usable approach to verification. 
Interactive systems have specificities that make 
using typical model checking tools difficult. The 
richness of the interaction between user and system 
places specific demands on the types of models that 
are needed. The assumptions that must be made 
about the users’ capabilities affect how the models 
should be built and the analysis of the verification 
results. Tools are needed that support the designer/ 
analyst in modeling, expressing properties, and rea- 
soning about the results of the verification from an 
interactive systems perspective. 

A possibility for this is to build layers on top of 
existing verification tools, so that the concepts
involved in the verification of usability-related properties
can be expressed more easily. The use of
graphical notations might be a useful possibility. The 
area opens several lines of work. One is research on 
interfaces for the verification tools. The STeP prover, 
for example, uses diagrams to represent proof strate- 
gies. Another is the need to support the formulation 
of properties. Where properties are expressed as
goals, graphical representations of start and
target interface states could perhaps be used.

Another area that needs further research is that 
of including models of users, work, and context of 
use in the verification process. Some work has 
already been done, but the models used are typically 
very simple. Increasing the complexity of the mod- 
els, however, means larger state spaces which in 
turn make verification more difficult. 

Finally, some work has already been done on 
reverse engineering of user interface code. This 
area deserves further attention. It can help in the 
analysis of existing systems, it can help verify imple- 
mentations against properties already proved of the 
models, or it can help cut down on the cost of building 
the models during development. 

CONCLUSION 

This article has reviewed the application of auto- 
mated reasoning tools to the usability analysis of 
interactive systems. Reasoning about usability is a 
difficult task whatever the approach used. The 
application of automated reasoning to this field still 
has a long way to go. 

Early approaches were based around very simple
models of the interactive system. As more complex
models started being considered, recognition grew
of the need to include considerations about the user
or context of usage in the verification process. Some
authors did this directly in the models; others encoded
that information in the properties to be proved.
More recently, the need has been recognized for
better tool support, and some steps have been taken
in that direction.

Applying automated reasoning tools will always
mean incurring the costs of developing adequate
models and having adequate expertise. Tool support
should help decrease these costs. Increased recognition
of the relevance of good usable designs,
especially when considering safety-critical and mass
market systems, should help make the remaining
cost more acceptable.

KEY TERMS

Automated Theorem Prover: A software tool 
that (semi-)automatically performs mathematical 
proofs. Available theorem provers range from fully 
interactive tools to provers that, given a proof, check 
if the proof is correct with no further interaction 
from the user. 

Cognitive Walkthrough: A model-based tech- 
nique for evaluation of interactive systems designs. 
It is particularly suited for “walk up and use” inter- 
faces such as electronic kiosks or ATMs. Its aim is 
to analyze how well the interface will guide first-time
or infrequent users in performing tasks. Analysis is
performed by asking three questions at each stage of
the interaction: Will the correct action be made
sufficiently evident to users? Will users connect
the correct action’s description with what they
are trying to achieve? Will users interpret
the system’s response to the chosen action correctly?

DSV-IS (Design, Specification and Verifica- 
tion of Interactive Systems): An annual interna- 
tional workshop on user interfaces and software 
engineering. The first DSV-IS workshop was held in 
1994 in Carrara (Italy). The focus of this workshop 
series ranges from the pure theoretical aspects to 
the techniques and tools for the design, development, 
and validation of interactive systems. 

Heuristic Evaluation: A technique for early 
evaluation of interactive systems designs. Heuristic 
evaluation involves systematic inspection of the 
design by means of broad guidelines for good prac- 
tice. Typically, 3 to 5 experts should perform the 
analysis independently, and afterwards combine and 
rank the results. A well-known set of heuristics is 
the one proposed by Nielsen: visibility of system 
status; match between the system and the real 
world; user control and freedom; consistency 
and standards; error prevention; recognition 
rather than recall; flexibility and efficiency of 
use; aesthetic and minimalist design; help users 
recognize, diagnose, and recover from errors; 
help and documentation. 

IFADIS (Integrated Framework for the
Analysis of Dependable Interactive Systems):
A tool for the analysis of user interface models
developed at the University of York (UK).

Lesar: A Lustre model checker developed at
Verimag (France).



Lite (LOTOS Integrated Tool Environ- 
ment): An integrated tool environment for working 
with LOTOS specifications. It provides specifica- 
tion, verification/validation, and implementation sup- 
port. The tools in LITE have been developed by 
participants in the LOTOSPHERE project (funded 
by the Commission of the European Community 
ESPRIT II programme). 

LOTOS (Language of Temporal Ordering 
Specifications): A formal specification language 
for specifying concurrent and distributed systems. 
LOTOS’ syntax and semantics are defined by ISO 
standard 8807:1989. LOTOS has been used, for 
example, to specify the Open Systems Interconnec- 
tion (OSI) architecture (ISO 7498). 

Lustre: A synchronous data-flow language for 
programming reactive systems. Lustre is the kernel 
language of the SCADE industrial environment de- 
veloped for critical real-time software design by CS 
Verilog (France). 

Modal User Interface: A user interface is said 
to be modal (or to have modes) when the same user 
action will be interpreted by the system differently 
depending on the system’s state and/or the output of 
the system means different things depending on 
system state. 

Mode Error: A mode error happens when the 
user misinterprets the mode the system is in. In this 
situation, actions by the user will be interpreted by 
the system in a way which will not be what the user 
is expecting, and/or the user will interpret the infor- 
mation provided by the system erroneously. Mode 
error typically leads to the user being confounded by 
the behavior of the system. 

Model Checker: A tool that automatically 
checks a temporal logic formula against a state 
machine. In the case of symbolic model checking, 
the tool does not handle the states in the state 
machine directly. Instead, it handles terms that 
define sets of states. In this way, it is possible to 
work with much larger state machines since it is not 
necessary to build them explicitly. 

OFAN: A Statecharts-based task modelling
framework developed at the Georgia Institute of
Technology (USA).

SMV (Symbolic Model Verifier): A symbolic
model checker originally developed at Carnegie
Mellon University (USA). Currently, two versions
exist: Cadence SMV, being developed by Cadence
Berkeley Laboratories (USA) as a research platform
for new algorithms and methodologies to incorporate
into their commercial products, and NuSMV,
a re-implementation and extension of the original
tool being developed as a joint project between
ITC-IRST (Italy), Carnegie Mellon University
(USA), the University of Genova (Italy), and the
University of Trento (Italy).

The STeP Prover (Stanford Temporal 
Prover): A tool to support the formal verification of 
reactive, real-time, and hybrid systems. STeP combines
model checking with deductive methods to
allow for the verification of a broader class of 
systems. 



Task Model: A description of how the system 
is supposed to be used to achieve pre-defined goals. 
Task models are usually defined in terms of the 
actions that must be carried out to achieve a goal. 

UIL (User Interface Language): A language 
for specifying user interfaces in Motif, the industry 
standard graphical user interface toolkit for UNIX
systems (as defined by the IEEE 1295 specification).

Usability: The ISO 9241-11 standard defines 
usability as “the extent to which a product can be 
used by specified users to achieve specified goals 
with effectiveness, efficiency, and satisfaction in a 
specified context of use.” 

User Model: A model that captures information 
about users. User models range from simple collec- 
tions of information about users to cognitive archi- 
tectures that attempt to simulate user behavior. 

INTRODUCTION 

Empirical methods in human-computer interaction 
(HCI) are very expensive, and the large number of 
information systems on the Internet requires great 
efforts for their evaluation. Automatic methods try 
to evaluate the quality of Web pages without human 
intervention in order to reduce the cost for evalua- 
tion. However, automatic evaluation of an interface 
cannot replace usability testing and other elaborate
methods.

Many definitions for the quality of information 
products are discussed in the literature. The user 
interface and the content are inseparable on the 
Web, and as a consequence, their evaluation cannot 
always be separated easily. Thus, content and inter- 
face are usually considered as two aspects of quality 
and are assessed together. A helpful quality defini- 
tion in this context is provided by Huang, Lee, and 
Wang (1999). It is shown in Table 1. 



The general definition of quality in Table 1 contains 
several aspects that deal with human-computer in- 
teraction. For example, the importance of accessibil- 
ity is stressed. The user and context are important in 
human-computer interaction, and the information- 
quality definition also considers suitability for the 
context as a major dimension. 

The automatic assessment of the quality of 
Internet pages has been an emerging field of re- 
search in the last few years. Several approaches 
have been proposed under various names. Simple 
approaches try to assess the quality of interfaces via 
the technological soundness of an implementation, 
or they measure the popularity of a Web page by link 
analysis. Another direction of research is also based 
on only one feature and considers the quality of free 
text. More advanced approaches combine evidence 
for assessing the quality of an interface on the Web. 
Table 2 shows the main approaches and the disci- 
pline from which they originated. 



Table 1. Categories of information quality (IQ) (Huang et al., 1999)

IQ Category          IQ Dimensions
-------------------  --------------------------------------------------
Intrinsic IQ         Accuracy, objectivity, believability, reputation
Contextual IQ        Relevancy, value-added, timeliness, completeness,
                     amount of information
Representational IQ  Interpretability, ease of understanding, concise
                     representation, consistent representation
Accessibility IQ     Access, security


Table 2. Disciplines and their approaches to automatic quality evaluation

Approach                        Discipline
------------------------------  -------------------------------
HTML syntax checking            Web software engineering
Link analysis                   Web information retrieval
Indicators for content quality  Library and information science
Interface evaluation            HCI
Text quality                    Human language technology

These approaches are discussed in the main 
sections. The indicators for content quality have not 
resulted in many implementations and are presented 
together with the interface evaluation in the subsec- 
tion “Page and Navigation Structure.” 

BACKGROUND 

In the past, two main directions of research have
contributed to establishing the automatic evaluation of
Internet resources: bibliometrics and software testing.

Link analysis applies well-known measures from 
bibliometrics to the Web. The number of references 
to a scientific paper has been used as an indicator for 
its quality. For the Web, the number of links to a Web 
page is used as the main indicator for the quality of 
that page (Choo, Detlor, & Turnbull, 2000). Mean- 
while, the availability of many papers online and 
some technical advancement have made bibliometric 
systems for scientific literature available on the 
Internet (Lawrence, Giles, & Bollacker, 1999). The 
availability of such measures will eventually lead to 
an even greater importance and impact of quantita- 
tive evaluation. 

Software testing has become an important challenge
as software becomes more and more complex.
In software engineering, automatic software testing 
has attracted considerable research. The success of 
the Internet has led to the creation of testing tools for 
standard Internet languages. 

MAIN ISSUES IN THE AUTOMATIC 
EVALUATION OF INTERFACES ON 
THE INTERNET 

HTML Syntax Checking 

Syntax-checking programs have been developed for 
programming languages and markup languages. 
Syntax checkers for HTML and other Web stan- 
dards analyze the quality of Web pages from the 
perspective of software engineering. However, some 
systems also consider aspects of human-computer 
interaction. 



One of the first tools for the evaluation of HTML 
pages was Weblint (Bowers, 1996). It is a typical 
system for syntax checking and operates on the 
following levels. 

• Syntax (Are all open tags closed? Are language
elements used in a syntactically correct way?)

• HTML use (Is the sequence of the headings 
consistent?) 

• Structure of a site (Are there links that lead one 
hierarchy level up?) 

• Portability (Can all expressions be interpreted 
correctly by all browsers?) 

• Stylistic problems (Is alternative text provided 
for graphics? Do words like here appear in link 
text?) 

The examples also illustrate how syntax-check- 
ing programs are related to human-computer inter- 
action. Some rules cover only the syntactical cor- 
rectness. Others address the user experience for a 
page. For example, missing alternative text for im- 
ages poses no syntax problem, but it may annoy 
users of slow-loading pages. Being so general,
these simple rules do not apply in every context. For
instance, a link upward may not be useful for 
nonhierarchical sites. 
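A few checks of this kind can be sketched with the HTML parser from the Python standard library. This is only an illustration of the rule types listed above, not the implementation of Weblint; the class name, the warning texts, and the list of vague link words are assumptions made for the example.

```python
from html.parser import HTMLParser


class SimpleLint(HTMLParser):
    """Collects a small, illustrative subset of lint-style warnings."""

    VAGUE_LINK_WORDS = {"here", "click here"}  # assumed word list

    def __init__(self):
        super().__init__()
        self.warnings = []
        self._in_link = False
        self._link_text = []
        self._last_heading = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and not attributes.get("alt"):
            # Stylistic rule: graphics should carry alternative text.
            self.warnings.append("img without alternative text")
        elif tag == "a":
            self._in_link = True
            self._link_text = []
        elif len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            # HTML use rule: heading levels should not skip a level.
            level = int(tag[1])
            if level > self._last_heading + 1:
                self.warnings.append(f"heading level jumps to h{level}")
            self._last_heading = level

    def handle_data(self, data):
        if self._in_link:
            self._link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            text = "".join(self._link_text).strip().lower()
            if text in self.VAGUE_LINK_WORDS:
                # Stylistic rule: link text should describe its target.
                self.warnings.append(f"vague link text: {text!r}")
            self._in_link = False


def lint(html):
    """Return the list of warnings found in an HTML string."""
    checker = SimpleLint()
    checker.feed(html)
    checker.close()
    return checker.warnings
```

Calling `lint` on a page fragment returns the warnings in document order. As the text notes, such rules are context independent, so a warning is a hint for the designer rather than proof of a usability problem.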

A more comprehensive system than Weblint is 
available from the National Institute of Standards 
and Technology (NIST, http://zing.nscl.nist.gov/WebTools/).
Its system WebSAT is part of the Web Metrics
suite. WebSAT is based on guidelines from
the IEEE and checks whether tags for visually 
impaired users are present. It also tests whether 
forms are used correctly and whether the relation 
between links and text promises good readability. 

An overview of systems for code checking is 
provided by Brajnik (2000). Such systems can obvi- 
ously lead only to a limited definition of quality. 
However, they will certainly be part of more com- 
plex systems in the future. 

Link Analysis 

One of the main challenges for search engines is the 
heterogeneous quality of Internet pages concerning 
both content quality and interface design (Henzinger, 
Motwani, & Silverstein, 2002). In Web information 
retrieval, the common approach to automatically 
measure the quality of a page has been link analysis. 
However, link analysis is a heuristic method. The 
number of links pointing to a page is considered as the 
main quality indicator (Brin & Page, 1998). From the 
perspective of human-computer interaction, it is as- 
sumed that authors of Web pages prefer to link to 
pages that are easy to use for them. 

A large variety of algorithms for link analysis has 
been developed (Henzinger, 2000). The most well- 
known ones are probably the PageRank algorithm 
and its variants (Haveliwala, 2002; Jeh & Widom, 
2003). The basic assumption of PageRank and simi- 
lar approaches is that the number of in- or back-links 
of a Web page can be used as a measure for the 
authority and consequently for the quality of a page 
including its usability. PageRank assigns an authority 
value to each Web page that is primarily a function of
its back-links. Additionally, it assumes that links from
pages with high quality should be weighted higher and
should result in a higher quality for the receiving 
page. To account for the different values each page 
has to distribute, the algorithm is carried out itera- 
tively until the result converges. PageRank may also 
be interpreted as an iterative matrix operation 
(Meghabghab, 2002). 
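The iterative computation can be sketched as follows. This is a simplified textbook-style version, not the implementation of any search engine. The damping factor of 0.85, the convergence tolerance, and the uniform treatment of pages without out-links are assumptions that do not come from the text above.

```python
def pagerank(links, damping=0.85, tolerance=1e-10, max_iterations=200):
    """Compute authority values for a link graph.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(max_iterations):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page distributes its current value evenly over its out-links,
        # so links from highly ranked pages carry more weight.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tolerance:
            break  # the values have converged
    return rank
```

For a hypothetical three-page graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` receives the highest value because it has two back-links, and page `a` ranks above page `b` because its single back-link comes from the highly ranked page `c`.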

As mentioned above, link analysis has its histori- 
cal roots in the bibliometric analysis of research 
literature. Meanwhile, link analysis is also applied to 
measure the quality of research institutes. For ex- 
ample, one study investigates the relationship be- 
tween the scientific excellence of universities and 
the number of in-links of the corresponding university 
pages (Thelwall & Harries, 2003). 

The Kleinberg algorithm (1998) is a predecessor 
of PageRank and works similarly. It assigns two 
types of values to the pages. Apart from the authority 
value, it introduces a so-called hub value. The hub 
value represents the authority as an information 
intermediary. The Kleinberg algorithm assumes that 
there are two types of pages: content pages and link 
pages. The hub value or information-provider quality 
is high when the page refers to many pages with high 
authority. Accordingly, the topical authority is in- 
creased when a page receives links from highly rated 
hubs (Kleinberg, 1998). Unlike PageRank, which is 
intended to work for all pages encountered by a Web 
spider of a search engine, the Kleinberg algorithm
was originally designed to work on the expanded
result set of a search engine.
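The mutual reinforcement between hub and authority values can be sketched as below. The fixed number of iterations and the normalization by Euclidean length are assumptions of this sketch, and the page names in the usage note are hypothetical.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) dictionaries for a link graph.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {page: 1.0 for page in pages}
    authority = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority: sum of the hub values of the pages linking to the page.
        authority = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                authority[target] += hub[source]
        # Hub: sum of the authority values of the pages the page links to.
        hub = {p: sum(authority[t] for t in links.get(p, [])) for p in pages}
        # Normalize both score sets so the values stay bounded.
        for scores in (authority, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm
    return hub, authority
```

With `links = {"portal": ["paper1", "paper2"], "blog": ["paper1"]}`, the link page `portal` obtains the highest hub value and the content page `paper1` the highest authority value, which mirrors the distinction between link pages and content pages made above.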

Link analysis has been widely applied; however, 
it has several serious shortcomings. The assignment
of links is a social process leading to remarkably
stable patterns. The number of in-links for a
Web page follows a power law distribution (Adamic
& Huberman, 2001). For such a distribution, the
median value is much lower than the average. That
means that many pages have few in-links while a few
pages have an extremely high number of in-links. 
This finding indicates that Web-page authors choose 
the Web sites they link to without a thorough quality 
evaluation. Rather, they act according to economic 
principles and invest as little time as possible for 
their selection. As a consequence, social actors in 
networks rely on the preferences of other actors 
(Barabasi, 2002). Pages with a high in-link degree 
are more likely to receive further in-links than other 
pages (Pennock, Flake, Lawrence, Glover, & Giles, 
2002). Another reason for setting links is thematic 
similarity (Chakrabarti, Joshi, Punera, & Pennock, 
2002). Clearly, quality assessment is not the only 
reason for setting a link. 
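The gap between median and average in such a skewed distribution can be seen in a small worked example. The in-link counts below are invented for illustration and are not taken from any of the studies cited.

```python
from statistics import mean, median

# Hypothetical in-link counts for ten pages (invented for illustration).
in_links = [1, 1, 1, 2, 2, 3, 5, 8, 40, 937]

average_in_links = mean(in_links)   # dominated by the single popular page
median_in_links = median(in_links)  # reflects the typical page
```

Here the average is 100 in-links per page, while the median is only 2.5: a single heavily linked page accounts for almost all links.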

A study of university sites questions the assump- 
tion that quality is associated with high in-link 
counts. It was shown that the links from university 
sites do not even lead to pages with scientific 
material in most cases. They rather refer the user 
to link collections and subject-specific resources 
(Thelwall & Harries, 2003). 

Large-scale evaluation of Web information re- 
trieval has been carried out within TREC (Text 
Retrieval Conference, http://trec.nist.gov; Hawk- 
ing & Craswell, 2001). TREC provides a test bed 
for information-retrieval experiments. The annual 
event is organized by NIST, which maintains a large 
collection of documents. Research groups apply 
their retrieval engines to this corpus and optimize 
them with results from previous years. They submit 
their results to NIST, where the relevance of the 
retrieved documents is intellectually assessed. A 
few years ago, a Web track was introduced in 
which a large snapshot of the Web forms the 
document collection. In this context, link-based 
measures have been compared with standard re- 
trieval rankings. The results show that the consid- 
eration of link structure does not lead to better 
retrieval performance. This result was observed in 
the Web track at TREC 2001. Only for the home- 
page-finding task have link-based authority mea- 
sures like the PageRank algorithm led to an improve- 
ment (Hawking & Craswell, 2001). 

Link analysis is the approach that has attracted 
the most research within automatic quality evalua- 
tion. However, it has its shortcomings, and good 
usability is only one reason to set a link. As a 
consequence, link-analysis measures should not be 
used as the only indicator for the quality of the user 
interface. 

Quality of Text 

If systems for assessing the quality of text succeed, 
they will play an important role for human-computer 
interaction. The readability of text is an important 
issue for the usability of a page. Good readability 
leads to fast and more satisfying interaction. The 
prototypes developed so far are focused on the 
application of teacher assistance for essay grading. 
However, the use of such systems for Web re- 
sources will soon be debated. The automatic evalu- 
ation of the quality of text certainly poses many 
ethical questions and will raise a lot of debate once 
it is implemented on a larger scale. 

Two approaches are used in prototypes for the 
automatic evaluation of texts. The first approach is 
to measure the coherence of a text and use it as a 
yardstick. The second typical approach is to calcu- 
late the similarity of the texts to sample texts that 
have been evaluated and graded by humans. The 
new texts are then graded according to their similar- 
ity to the already graded texts. 

The Intelligent Essay Assessor is based on latent 
semantic indexing (LSI; Foltz, Laham, & Landauer, 
1999). LSI is a reduction technique. In text analysis, 
usually each word is used for the semantic descrip- 
tion of the text, resulting in a large number of 
descriptive elements. LSI analyzes the dependen- 
cies between these dimensions and creates a re- 
duced set of artificial semantic dimensions. The 
Intelligent Essay Assessor considers a set of essays 
graded by humans. For each essay, it calculates the 
similarity of the essay to the graded essays using 
text-classification methods. The grade of the most 
similar cluster is then assigned to the essay. For
1,200 essays, the system reached a correlation of 0.7
to the grades assigned by humans. The correlation 
between two humans was not higher than that. 

A similar level of performance was reached by a 
system by Larkey (1998). This prototype applies the 
Bayesian k-nearest neighbor classifier and does not 
carry out any reduction of the word dimensions 
(Larkey). 
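Grading by similarity to already graded texts can be sketched with a plain nearest-neighbor vote. This is not the system described by Larkey: it assumes raw word counts as features, cosine similarity as the measure, and a simple majority vote among the k most similar essays, and the function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text by its word counts (no dimension reduction)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_grade(essay, graded_essays, k=3):
    """Assign the most frequent grade among the k most similar essays.

    `graded_essays` is a list of (text, grade) pairs graded by humans.
    """
    query = vectorize(essay)
    ranked = sorted(
        graded_essays,
        key=lambda item: cosine(query, vectorize(item[0])),
        reverse=True,
    )
    votes = Counter(grade for _, grade in ranked[:k])
    return votes.most_common(1)[0][0]
```

A realistic system would need a large set of human-graded essays per topic; with only a handful of examples, as in a quick test, the vote is driven almost entirely by shared vocabulary.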

The readability of text depends to a certain extent 
on the coherence within the text and between its 
parts and sentences. In another experiment, LSI 
was used to measure the similarity between consecutive
sentences in a text. The average similarity
between all such sentence pairs determines the coherence
of the text. This value was compared to the
typical similarity in a large corpus of sentences. The
results obtained are consistent with psychological
experiments estimating the readability (Larkey, 
1998). 
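The coherence measure can be sketched as the average similarity of neighboring sentences. As a simplification, the sketch below compares raw word-count vectors instead of the reduced LSI dimensions used in the experiment, so it only captures literal word overlap; the sentence splitting on punctuation is a further assumption.

```python
import math
import re
from collections import Counter


def sentence_vectors(text):
    """Split a text into sentences and count the words in each one."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coherence(text):
    """Average similarity between each sentence and the one after it."""
    vectors = sentence_vectors(text)
    pairs = list(zip(vectors, vectors[1:]))
    if not pairs:
        return 0.0  # fewer than two sentences: no pair to compare
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

A text whose sentences pick up words from their predecessors scores higher than a sequence of unrelated sentences. A reduced semantic space would additionally recognize sentences that are related without sharing any words.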

In Web pages, some text elements are more 
important than others for navigation. The text on 
interaction elements plays a crucial role for the usabil- 
ity. It is often short and cannot rely as much on context 
as phrases within a longer text. These text elements 
need special attention. In one experiment, measures
similar to those presented above were used to determine
the coherence between link text and the content of the
pages to which the link points (Chi et al., 2003). These
measures are likely to be applied to Web resources 
more extensively within the next few years. 

Page and Navigation Structure

Advanced quality models that take several aspects 
into account are still at an experimental stage. 
However, they are the most relevant for human- 
computer interaction. These systems go beyond 
syntax checking and analyze the design of pages by 
extracting features from the HTML code of pages. 
Prototypes consider the use of interaction elements 
and design, for example, by looking for a balanced 
layout. 

Whereas the first approaches evaluated the fea- 
tures of Web pages intellectually (Bucy, Lang, Pot- 
ter, & Grabe, 1999), features are now extracted 
more and more automatically. Most of these proto- 
types are based on intellectual quality assessment of 
pages and use these decisions as a yardstick or 
training set for their algorithms. 

An experiment carried out by Amento, Terveen, 
and Hill (2000) suggests that the human perception 
of the quality of Web pages can be predicted equally 
well by four formal features. These four features 
include link-analysis measures like the PageRank 
value and the total number of in-links. However, 
simple features like the number of pages on a site 
and the number of graphics on a page also correlated 
highly with the human judgments (Amento et al.). 

The system WebTango extracts more than 150 
atomic and simple features from a Web page and 
tries to reveal statistical correlations to a set of sites 
rated as excellent (Ivory & Hearst, 2002). The 
extracted features are based on the design, the 
structure, and the HTML code of a page. WebTango 
includes the ratings of the Weblint system discussed 
above. The definition of the features is based on 
hypotheses on the effect of certain design elements 
on usability. As a consequence, the approach is 
restricted by the validity of these assumptions. The 
human ratings could be reproduced to a great extent 
by the statistical approach. 
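Extracting simple features from the HTML code of a page can be sketched as follows. The three features chosen here (word, link, and image counts) are only illustrative stand-ins; they are not the feature set of WebTango, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class FeatureExtractor(HTMLParser):
    """Counts a few simple page features (an illustrative subset)."""

    def __init__(self):
        super().__init__()
        self.features = {"words": 0, "links": 0, "images": 0}
        self._skip = 0  # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.features["links"] += 1
        elif tag == "img":
            self.features["images"] += 1
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Only text visible to the user counts as words.
        if not self._skip:
            self.features["words"] += len(data.split())


def extract_features(html):
    """Return a dictionary of feature counts for an HTML string."""
    extractor = FeatureExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.features
```

Feature vectors of this kind, computed for many pages, could then be compared statistically with human ratings. Which features actually matter for perceived quality is an empirical question that the sketch does not answer.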

The information scent analysis in the Bloodhound 
project uses link analysis to compare the anchor text 
and its surroundings with the information on the 
linked page (Chi et al., 2003). The system simulates 
log files by automatically navigating through a site 
and determines the quality measure of the site as the 
average similarity between link text and the text on 
the page the link leads to. The integration with log files 
shows the interaction focus of the approach. Link 
analysis has also been combined with log analysis. 
For example, the approach called usage-aware page 
rank assigns a bias for pages often accessed (Oztekin, 
Ertoz, & Kumar, 2003). Usage-aware page rank and 
Bloodhound are limited to one site. 

When comparing the systems, it can be observed 
that no consensus has been reached yet about which 
features of Web pages are important for the quality 
decisions of humans. Much more empirical research 
is necessary in order to identify the most relevant 
factors. 



FUTURE TRENDS 

The literature discussed in the main section shows 
a clear trend from systems with objective defini- 
tions of quality toward systems that try to capture 
the subjective perspective of individual users. The 
most recent approaches rely mainly on statistical 
approaches to extract quality definitions from al- 
ready assessed resources and apply them to new 
pages. Future systems will also rely on more and 
more criteria in order to provide better quality 
decisions. 

Web log files representing the actual information 
behavior may play a stronger role in these future 
systems. The assessment of the quality of texts will 
probably gain more importance for Internet re- 
sources. 

Quality will also be assessed differently for dif- 
ferent domains, for different types of pages, and for 
various usability aspects. As individualization is an 
important issue in Web information systems, efforts 
to integrate personal viewpoints on quality into qual- 
ity-assessment systems will be undertaken. 

Automatic evaluation will never replace other 
evaluation methods like user tests. However, it
will be applied in situations where a large number of 
Internet interfaces need to be evaluated. 



CONCLUSION 

The automatic evaluation of Internet resources is a 
novel research field that has not reached maturity 
yet. So far, each of the different approaches is 
deeply rooted within its own discipline and relies on 
a small number of criteria in order to measure a 
limited aspect of quality. In the future, systems will 
need to integrate several of the current approaches 
in order to achieve quality metrics that are helpful for 
the user. These systems will contribute substantially 
to the field of human-computer interaction. 

Automatic evaluation allows users and develop- 
ers to evaluate many interfaces according to their 
usability. Systems for automatic evaluation will help 
to identify well-designed interfaces as well as good 
interface elements. The evaluation of a large number
of Internet interfaces will reveal trends in interface
design and identify design elements often and
successfully used.

KEY TERMS

Authority: Link analysis considers Web pages 
of high quality to be authorities for their topic. That 
means these pages contain the best, most convinc- 
ing, most comprehensive and objective information 
for that topic. 

Bibliometrics: Bibliometrics studies the relationship
amongst scientific publications. The most
important application is the calculation of impact
factors for publications. During this process, a high 
number of references is considered to be an indica- 
tor for high scientific quality. Other analyses include 
the structure and the development of scientific com- 
munities. 

Hub: Hub is a term for Web pages in link 
analysis. In contrast to an authority page, a hub page 
does not contain high-quality content itself, but links 
to the authorities. A hub represents an excellent 
information provider and may be a clearinghouse or 
a link collection. The high quality of these pages is 
shown by the information sources they contain. 

Information Retrieval: Information retrieval is 
concerned with the representation of knowledge and 
subsequent search for relevant information within 
these knowledge sources. Information retrieval pro- 
vides the technology behind search engines. 

Latent Semantic Indexing: Latent semantic 
indexing is a dimension-reduction technique used in 
information retrieval. During the analysis of natural- 
language text, each word is usually used for the 
semantic representation. As a result, a large number 
of words describe a text. Latent semantic indexing 
combines many features and finds a smaller set of 
dimensions for the representation that describes 
approximately the same content. 

Link Analysis: The links between pages on the 
Web are a large knowledge source that is exploited 
by link-analysis algorithms for many ends. Many 
algorithms similar to PageRank determine a quality 
or authority score based on the number of incoming 
links of a page. Furthermore, link analysis is applied 
to identify thematically similar pages, Web commu- 
nities, and other social structures. 

PageRank: The PageRank algorithm assigns a 
quality value to each known Web page that is 
integrated into the ranking of search-engine results. 
This quality value is based on the number of links that 
point to a page. In an iterative algorithm, the links 
from high-quality pages are weighted higher than 
links from other pages. PageRank was originally 
developed for the Google search engine. 
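
The iterative weighting described under PageRank can be sketched as a simple power iteration. This is a minimal illustration, not the algorithm as deployed in any search engine: the damping factor of 0.85, the fixed number of iterations, the handling of pages without outgoing links, and the four-page example graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small base share regardless of its links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its current rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page without outgoing links: spread its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: a, b, and d all point to c; c points back to a.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
print(sorted(ranks, key=ranks.get, reverse=True))
```

In the example, page c collects the most incoming links and ends up with the highest value, while page a ranks above b and d because its single incoming link comes from the highly ranked page c.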

INTRODUCTION

Facial expression analysis is an active area in hu- 
man-computer interaction. Many techniques of fa- 
cial expression analysis have been proposed that try 
to make the interaction tighter and more efficient. 

The essence of facial expression analysis is to 
recognize facial actions or to perceive human emo- 
tion through the changes of the face surface. Gen- 
erally, there are three main steps in analyzing facial 
expression. First, the face should be detected in the 
image or the first frame of image sequences. Sec- 
ond, the representation of facial expression should 
be determined, and the data related to facial expres- 
sion should be extracted from the image or the 
following image sequences. Finally, a mechanism of 
classification should be devised to classify the facial 
expression data. 

In this article, the techniques for automatic facial
expression analysis are discussed. The aim is to
classify the various methods into categories rather
than to give an exhaustive review.

The rest of this article is organized as follows.
First, the background is presented briefly. Next, the
techniques used in the three steps (detecting the
face, representing the facial expression, and
classifying the facial expression) are described in
turn. Then some facial expression databases are
discussed, followed by the challenges and future
trends of facial expression analysis. Finally,
conclusions are drawn.

BACKGROUND 

During the past decade, the development of image
analysis, object tracking, pattern recognition,
computer vision, and computer hardware has brought
facial expression into human-computer interaction
as a new modality, making the interaction tighter
and more efficient. Many systems for automatic
facial expression analysis have been developed
since the pioneering work of Mase and Pentland
(1991). Some surveys of automatic facial expression
analysis (Fasel & Luettin, 2003; Pantic &
Rothkrantz, 2000a) have also appeared.

Various applications using automatic facial ex- 
pression analysis can be envisaged in the near 
future, fostering further interest in doing research in 
different areas (Fasel & Luettin, 2003). However, 
many challenges remain in developing an ideal
automatic facial expression analysis system.

DETECTING FACE 

Before dealing with the information of facial expres- 
sion, the face should be located in images or se- 
quences. Given an arbitrary image, the goal of face 
detection is to determine whether or not there are 
faces in the image, and if present, return the location 
and extent of each face. Two good surveys of face 
detection have been published recently (Hjelmas, 
2001; Yang, Kriegman, & Ahuja, 2002). 

In most of the systems for facial expression 
analysis, it is assumed that only one face is contained 
in the image and the face is near the front view. 
Then, the main aim of this step is to locate the face 
and facial features. 

In face detection, the input can be either a static 
image or an image sequence. Because the methods 
are totally different, we discuss them separately in 
the following paragraphs. 

The techniques of face detection from static
images can be classified into four categories (Yang
et al., 2002), although some methods clearly overlap
the category boundaries. These four types of
techniques are as follows:

• Knowledge-Based Methods: These rule- 
based methods encode human knowledge about 
what constitutes a typical face. Usually, the 
rules capture the relationships between facial 
features. These methods are designed mainly 
for face localization. 

• Feature-Invariant Approaches: These al- 
gorithms aim to find structural features that 
exist even when the pose, viewpoint, or lighting 
conditions vary, and then use these features to 
locate faces. Usually, the facial features, such 
as the edge of the eye and mouth, texture, skin 
color, and the integration of these features, are 
used to locate faces. These methods are de- 
signed mainly for face localization. 

• Template-Matching Methods: Several stan- 
dard patterns of faces are stored to describe 
the face as a whole or the facial features 
separately. The correlations between an input 
image and the stored patterns are computed for 
detection. Usually, predefined face templates 
and deformable templates are used. These 
methods have been used for both face localiza- 
tion and detection. 

• Appearance-Based Methods: In contrast to 
template matching, models (or templates) are 
learned from a set of training images that 
should capture the representative variability of 
facial appearance. These learned models are 
then used for detection. Many learning models 
are studied, such as eigenface, the distribution- 
based method, neural networks, support vector 
machines, the Naive Bayes classifier, the hid- 
den Markov model, the information-theoretical 
approach, and so forth. These methods are 
designed mainly for face detection. 
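
As an illustration of the template-matching category, the correlation between an input image and a stored pattern can be sketched as follows. The zero-mean normalized correlation measure, the exhaustive sliding-window search, and the tiny 2 x 2 "template" are assumptions made for the example; practical systems use real face templates and usually search over several scales.

```python
import math


def normalized_correlation(patch, template):
    """Zero-mean normalized correlation of two equally sized 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a flat patch or flat template carries no pattern
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best fit."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = normalized_correlation(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best


# Hypothetical stored pattern and a small grayscale image containing it.
template = [[0, 9], [9, 0]]
image = [
    [1, 1, 1, 1],
    [1, 1, 0, 9],
    [1, 1, 9, 0],
    [1, 1, 1, 1],
]
score, row, col = best_match(image, template)
print(score, row, col)
```

Because the correlation is normalized, the score does not change when the brightness or contrast of the patch is scaled uniformly, which is one reason correlation-based matching is preferred over raw pixel differences.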

The face can also be detected by the motion 
information from image sequences. The approaches 
based on image sequences attempt to find the invari- 
ant features through face or head motion. They can 
be classified into two categories: 

• Accumulated Frame Difference: In this type
of approach, moving silhouettes (candidates)
that include facial features, a face, or body
parts are extracted by thresholding the
accumulated frame difference. Then, some rules
are set to measure the candidates. These
approaches are straightforward and easy to
implement. However, they are not robust to
noise and insignificant motion.

• Moving Image Contour: In this approach,
the motion is measured through the estimation
of moving contours, such as optical flow.
Compared to frame difference, results from moving
contours are always more reliable, especially
when the motion is insignificant (Hjelmas, 2001).
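
The accumulated frame difference can be sketched in a few lines. The frame data, the threshold value, and the use of a bounding box as the extracted candidate are assumptions for illustration; the rules that later judge whether a candidate is a face are not shown.

```python
def accumulated_frame_difference(frames):
    """Sum absolute pixel differences between consecutive grayscale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(height):
            for c in range(width):
                acc[r][c] += abs(curr[r][c] - prev[r][c])
    return acc


def moving_silhouette(frames, threshold):
    """Binary mask: 1 where the accumulated motion exceeds the threshold."""
    acc = accumulated_frame_difference(frames)
    return [[1 if v > threshold else 0 for v in row] for row in acc]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around the mask, or None."""
    cells = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 3 x 4 sequence: a bright spot moves right, one pixel flickers.
frames = [
    [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]],
    [[10, 10, 10, 10], [10, 90, 10, 10], [10, 10, 11, 10]],
    [[10, 10, 10, 10], [10, 10, 90, 10], [10, 10, 10, 10]],
]
mask = moving_silhouette(frames, threshold=20)
print(bounding_box(mask))
```

The flickering pixel in the last row stays below the threshold and is ignored here, but the result depends directly on the chosen threshold, which illustrates why these approaches are sensitive to noise and to small motions.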

REPRESENTING FACIAL EXPRESSION

After determining the location of the face, the 
information of the facial expression can be ex- 
tracted. In this step, the fundamental issue is how to 
represent the information of facial expression from 
a static image or an image sequence. 

Benefiting from the development of image analy- 
sis, object and face tracking, and face recognition, 
many approaches have been proposed to represent 
the information of facial expression. These methods 
can be classified into different classes according to 
different criteria. Five kinds of approaches are 
discussed as follows. 

According to the type of input, the approaches 
for representing facial expression can be classified 
into two categories: 

• Static-Image-Based Approaches: The sys- 
tem analyzes the facial expression in static 
images. Typically, a neutral expression is needed 
to find the changes caused by facial expres- 
sions (Buciu, Kotropoulos, & Pitas, 2003; Chen 
& Huang, 2003; Gao, Leung, Hui, & Tanada, 
2003; Pantic & Rothkrantz, 2000b, 2004). 

• Image-Sequence-Based Approaches: The
system attempts to extract the motion or
changes of the face or facial features, and uses
the spatial trajectories or spatiotemporal
information to represent the facial expression
information (Cohen, Sebe, Garg, Chen, & Huang,
2003; Essa & Pentland, 1997; Tian, Kanade, &
Cohn, 2001; Zhang, 2003). Because facial
expression is a spatial-temporal phenomenon, it is
reasonable to believe that the approaches that
deal with the image sequences could obtain
more reliable results.

According to the space where the information is 
represented, the representations of facial expression 
information can be categorized into three types: 

• Spatial Space-Based Approaches: The
deformation or the difference between the neutral
face or facial features and the current face
or facial features is used to represent the facial
expression information (Buciu et al., 2003; Chen
& Huang, 2003; Gao et al., 2003; Pantic &
Rothkrantz, 2004). The methods used in face
recognition are usually adopted in this type of
approach.

• Spatial Trajectory-Based Approaches: The
spatial trajectory from the neutral face or facial
features to the current face or facial features is
used to represent the facial expression
information (Tian et al., 2001). The fundamental issue
of this kind of method is the motion of the face
or facial features.

• Spatiotemporal Trajectory-Based
Approaches: The temporal information is also
used in addition to the spatial trajectory (Cohen et al.,
2003; Essa & Pentland, 1997). The trajectories
are usually represented within spatiotemporal
models, such as hidden Markov models.
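
The difference between a spatial representation and a spatial trajectory can be made concrete with a short sketch. The feature-point names, the coordinates, and the choice of plain (dx, dy) offsets are hypothetical; the cited systems use richer feature sets and models.

```python
def displacement_features(neutral, current):
    """Per-point (dx, dy) offsets from the neutral face, flattened to one vector."""
    features = []
    for name in sorted(neutral):
        nx, ny = neutral[name]
        cx, cy = current[name]
        features.extend([cx - nx, cy - ny])
    return features


def trajectory(neutral, frames):
    """Spatial trajectory: one displacement vector per frame of a sequence."""
    return [displacement_features(neutral, frame) for frame in frames]


# Hypothetical tracked points (image coordinates, y grows downward).
neutral = {"left_mouth_corner": (40.0, 80.0), "right_mouth_corner": (60.0, 80.0)}
smile_frames = [
    {"left_mouth_corner": (39.0, 79.0), "right_mouth_corner": (61.0, 79.0)},
    {"left_mouth_corner": (37.0, 76.0), "right_mouth_corner": (63.0, 76.0)},
]
print(trajectory(neutral, smile_frames))
```

A single displacement vector corresponds to the spatial representation of one image, the list of vectors corresponds to a spatial trajectory, and pairing each vector with its frame time would give the input expected by a spatiotemporal model.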

According to the regions of a face where the face 
is processed, the representations can be categorized 
into local approaches, holistic approaches, or hybrid
approaches: 

• Local Approaches: The face is processed by 
focusing on local facial features or local areas 
that are prone to change with facial expression 
(Pantic & Rothkrantz, 2000b, 2004; Tian et al., 
2001). In general, intransient facial features 
such as the eyes, eyebrows, mouth, and tissue 
texture, and the transient facial features such as 
wrinkles and bulges are mainly involved in fa- 
cial expression displays. 

• Holistic Approaches: The face is processed as
a whole to find the changes caused by facial
expressions (Chen & Huang, 2003; Gao et al.,
2003).

• Hybrid Approaches: Both the local facial 
features and the whole face are considered to 
represent the facial expression (Buciu et al., 
2003; Cohen et al., 2003; Essa & Pentland, 
1997; Lyons, Budynek, & Akamatsu, 1999). 
For example, a grid model of the whole face is 
used to represent the whole face, and the 
properties of local facial features are also 
used in the approach (Lyons et al., 1999).

According to how the face is described, the
approaches can be classified into image-based,
2-D-model-based, and 3-D-model-based approaches.

• Image-Based Approaches: The intensity 
information is used directly to represent the 
deformation or the motion of the face or facial 
features (Chen & Huang, 2003; Donato, 
Bartlett, Hager, Ekman, & Sejnowski, 1999). 

• 2-D-Model-Based Approaches: The face 
is described with the aid of a 2-D face model, 
including the facial features or the whole face 
region, without attempting to recover the volu- 
metric geometry of the scene (Buciu et al., 
2003; Cohen et al., 2003; Gao et al., 2003; 
Pantic & Rothkrantz, 2004; Tian et al., 2001). 

• 3-D-Model-Based Approaches: The face 
is described by a 3-D face model. These 
techniques have the advantage that they can 
be extremely accurate, but have the disadvan- 
tage that they are often slow, fragile, and 
usually must be trained by hand (Essa & 
Pentland, 1997). 

According to whether the representation is mod- 
eled on the face surface or not, methods can be 
classified into appearance-based and muscle-based 
approaches. 

• Appearance-Based Approaches: The fa- 
cial expressions are all represented by the 
appearance of the face or facial features, and 
the information extracted from the appear- 
ance is used to analyze the facial expression 
(Buciu et al., 2003; Chen & Huang, 2003; 
Cohen et al., 2003; Tian et al., 2001). 

• Muscle-Based Approaches: The approaches
focus on the effects of facial muscle activities
and attempt to infer muscle activities
from visual information (Essa & Pentland, 1997;
Mase & Pentland, 1991). This may be achieved
by using 3-D muscle models that allow mapping
of the extracted optical flow into muscle
actions. The muscle-based approaches are able
to easily synthesize facial expressions.
However, the relationships between the motion of
the appearance and the motion of the muscles are
not so easily dealt with.

After determining the representation of the facial 
expression, the information of the facial expression 
can be extracted according to the representation 
approach. In general, the methods of extracting 
facial expression information are determined by the 
type of facial expression representation. 

Each approach has its advantages and
disadvantages. In a facial expression analysis system, the
facial expression information is always represented
using a combination of several types of facial
expression representation.

Because facial expression is a spatial-temporal
phenomenon, it is reasonable to believe that
approaches that deal with image sequences and use
spatiotemporal trajectories could obtain more
reliable results. The facial expression is
the effect of the whole face, and any small change
of facial appearance means a change of facial
expression. Although limited by accuracy and
computation time, approaches that use hybrid
representations or 3-D models of the face may
be more promising. The muscle-based approaches
seem to be sound; however, the relationship between
the appearance and the motion of the muscles is not
so clear, and recovering it depends on the state of
image processing. Most approaches that deal with
the changes of appearance directly are therefore
more reasonable.



CLASSIFYING FACIAL EXPRESSION 

After the information of facial expression is
obtained, the next step of facial expression analysis is
to classify the facial expression conveyed by the
face. In this step, a set of categories should be
defined first. Then a mechanism can be devised for
classification.

Facial Expression Categories 

There are many ways of defining the categories of
facial expressions. In facial expression analysis
research, two ways of categorizing are
usually used. One is the Facial Action Coding
System (FACS; Ekman & Friesen, 1978), and the
other is prototypic emotional expressions (Ekman,
1982). Usually, the process of classifying the facial
expression into action units (AUs) in FACS is called
facial expression recognition, and the process of
classifying the facial expression into six basic
prototypic emotions is called facial expression
interpretation (Fasel & Luettin, 2003). Some systems can
classify the facial expression into either AUs (Pantic
& Rothkrantz, 2004; Tian et al., 2001) or one of six
basic emotions (Buciu et al., 2003; Chen & Huang,
2003; Cohen et al., 2003; Gao et al., 2003). Some
systems perform both (Essa & Pentland, 1997;
Pantic & Rothkrantz, 2000b).

Mechanism for Classification 

Like other pattern-recognition approaches, the aim
of facial expression analysis is to use the information
or patterns extracted from an input image to
classify the input image into a predefined pattern.
Generally, rule-based methods and statistic-based
methods are applied to classify facial expressions.

• Rule-Based Methods: In this type of method, 
rules or facial expression dictionaries are de- 
termined first by human knowledge, then the 
examined facial expression is classified by the 
rules or dictionaries (Pantic & Rothkrantz, 
2000b, 2004). 

• Statistic-Based Methods: Statistic-based
methods are the most successful approaches in
pattern recognition. Most of the approaches in
the literature classify facial expressions using
statistic-based methods (Buciu et al., 2003;
Chen & Huang, 2003; Cohen et al., 2003; Essa
& Pentland, 1997; Tian et al., 2001). Since the
early 1980s, statistical pattern recognition has
experienced rapid growth, especially in the
increasing interaction and collaboration among
different disciplines, including neural networks,
machine learning, statistics, mathematics,
computer science, and biology. Each examined
facial expression is represented as a point in an
n-dimensional feature space according to the
representation of the facial expression
information. Then, given a set of training patterns
from each class, the objective is to establish
decision boundaries in the feature space that
separate patterns belonging to different classes.
Of course, according to the learning methods,
the statistic-based methods could be classified
into many subcategories. For more details,
refer to the papers of Fasel and Luettin (2003),
Jain, Duin, and Mao (2000), and Pantic and
Rothkrantz (2000a).
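
The idea of points in a feature space separated by decision boundaries can be illustrated with one of the simplest statistical classifiers, the nearest-centroid rule. It is used here only as an example and is not the method of any cited system; the two-dimensional features (mouth-corner lift and eyebrow raise) and the training values are hypothetical.

```python
import math


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> mean vector."""
    sums, counts = {}, {}
    for vector, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(centroids, vector):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Hypothetical training patterns: (mouth-corner lift, eyebrow raise).
training = [
    ([3.0, 0.2], "happiness"), ([2.5, 0.0], "happiness"),
    ([0.1, 2.8], "surprise"), ([0.3, 3.1], "surprise"),
]
centroids = train_centroids(training)
print(classify(centroids, [2.8, 0.1]))
print(classify(centroids, [0.0, 3.0]))
```

With two classes, the decision boundary implied by this rule is the set of points equally distant from both centroids. More flexible learners, such as neural networks or support vector machines, differ mainly in the shape of the boundaries they can establish.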

FACIAL EXPRESSION DATABASE 

Many facial expression analysis systems have been
developed. Without a uniform facial expression
database, these systems cannot be compared. Some
considerations for a facial expression database are
the following (Kanade, Cohn, & Tian, 2000).

1. Level of description

2. Transitions among expressions 

3. Deliberate vs. spontaneous expression 

4. Reliability of expression data 

5. Individual difference among subjects 

6. Head orientation and scene complexity 

7. Image acquisition and resolution 

8. Relation to nonfacial behavior 

Several databases have been developed to evalu- 
ate facial expression analysis systems. However, 
most of them are not open. A few databases that can 
be obtained freely or by purchase are as follows: 

• CMU-Pittsburgh AU-Coded Face Expression
Image Database (Kanade et al., 2000):
The database is the most comprehensive test
bed to date for comparative studies of facial
expression analysis. It includes 2,105 digitized
image sequences from 182 adult subjects of
varying ethnicity performing multiple tokens
of most primary FACS action units.

• CMU AMP Face Expression Database: 
There are 13 subjects in this database, each 
with 75 images, and all of the face images are 
collected in the same lighting condition. Only the
facial expression changes between images.

• The Japanese Female Facial Expression 
(JAFFE) Database: The database contains 
213 images of seven facial expressions (six 
basic facial expressions and one neutral) posed 
by 10 Japanese female models. Each image 
has been rated on six emotion adjectives by 60 
Japanese subjects. The database was planned 
and assembled by Miyuki Kamachi, Michael
Lyons, and Jiro Gyoba (Lyons et al., 1999).

• Japanese and Caucasian Facial Expres- 
sion of Emotion (JACFEE) and Neutral 
Faces: The database consists of two sets of 
photographs of facial expression. JACFEE 
shows 56 different people, half male and half 
female, and half Caucasian and half Japanese. 
The photos are in color and illustrate each of 
seven different emotions. JACNEUF shows 
the same 56 subjects with neutral faces. 

FUTURE TRENDS AND CHALLENGES

Whatever the application, we think that future work
will deal with the problems mentioned above and
try to establish an ideal facial expression analysis
system. Of course, the development of facial
expression analysis will benefit from other areas,
including computer vision, pattern recognition,
psychological studies, and so forth.

The challenge to facial expression analysis can 
be outlined by an ideal facial expression system, 
which is proposed to direct the development of facial 
expression analysis systems, as shown in Table 1. 

Many researchers have attempted to meet the
challenges mentioned above. However, few of the
currently existing facial expression analysis
systems do so.

Table 1. Challenges to facial expression analysis

For all steps
• Automatic processing
• Real-time processing

Detecting face
• Deal with subjects of any age, ethnicity, and outlook
• Deal with variation in imaging conditions, such as lighting and camera characteristics
• Deal with variation in pose or head motion
• Deal with various facial expressions
• Deal with partially occluded faces

Representing facial expression
• Deal with both static images and image sequences
• Deal with various sizes and orientations of the face in the input image
• Deal with various poses of the face and head motion
• Deal with occluded face/facial features

Classifying facial expression
• Distinguish all possible expressions and their combinations
• Distinguish unlimited interpretation categories
• Quantify facial expressions or facial actions
• Deal with inaccurate facial expression data
• Deal with unilateral facial changes
• Feature an adaptive learning facility

In many systems, strong assumptions are made in
each step to make the problem of facial expression
analysis more tractable. Some common assumptions
are the use of a frontal facial view, constant
illumination, a fixed light source, no facial hair or glasses,
and the same ethnicity. Only the method proposed by
Essa and Pentland (1997) in the literature deals with
subjects of any age and outlook. In representing
facial expression, it is always assumed that the
observed subject is immovable.

None of the methods could distinguish all 44 
facial actions and their combinations. The method 
developed by Pantic and Rothkrantz (2004) deals 
with the most classes, that is, only 32 facial actions 
occurring alone or in combination, achieving an 86% 
recognition rate. 

Though some methods claim that they can deal 
with the six basic emotion categories and neutral 
expression (Cohen et al., 2003), it is not certain that 
all facial expressions displayed on a face can be 
classified under the six basic emotion categories in 
psychology. This makes the problem of facial ex- 
pression analysis even harder. 

CONCLUSION 

Facial expression is an important modality in
human-computer interaction. The essence of facial
expression analysis is to recognize the facial actions of
facial expression or perceive the human emotion
through the changes of the face surface. In this
article, three steps in automatic facial expression
analysis, which are detecting the face, representing
the facial expression, and classifying the facial
expression, have been discussed. Because of the
lack of a uniform facial expression database, the
facial expression analysis systems are hard to
evaluate. Though some developments have been achieved,
many challenges still exist.

KEY TERMS

Face Detection: Given an arbitrary image, the 
goal of face detection is to determine whether or not 
there are any faces in the image, and if present, 
return the image location and extent of each face. 

Face Localization: Given a facial image, the 
goal of face localization is to determine the position 
of a single face. This is a simplified detection prob- 
lem with the assumption that an input image contains 
only one face. 

Face Model Features: The features used to
represent (model) the face or facial features, such
as the width, height, and angle in a template of the
eye, or all nodes and triangles in a 3-D face mesh
model.

Facial Action Coding System (FACS): The
most widely used and versatile method for
measuring and describing facial behaviors. It
was developed originally by Ekman and Friesen
(1978) in the 1970s by determining how the
contraction of each facial muscle (singly and in combination
with other muscles) changes the appearance of the
face.

Facial Expression Recognition: Classifying 
the facial expression to one facial action unit defined 
in FACS or a combination of action units, which is 
also called FACS encoding. 

Facial Expression Interpretation: Classifying
the facial expression into one basic emotional
category or a combination of categories. Often, the six
basic emotions defined by Ekman (1982) are used,
which are happiness, sadness, surprise, fear, anger,
and disgust.

Facial Features: The prominent features of the 
face, which include intransient facial features, such 
as eyebrows, eyes, nose, mouth, chin, and so forth, 
and transient facial features, such as the regions 
surrounding the mouth and the eyes. 

Intransient Facial Features: The features that 
are always on the face, but may be deformed due to 
facial expressions, such as eyes, eyebrows, mouth, 
permanent furrows, and so forth. 

Transient Facial Features: Different kinds of
wrinkles and bulges that occur with facial
expressions, especially on the forehead and in the regions
surrounding the mouth and the eyes.

INTRODUCTION

The Development of Broadband 

Broadband commonly refers to Internet connection
speeds greater than the narrowband connection speed
of 56 kbps. Digital subscriber lines (DSL) and cable
modems were the most popular forms of broadband
in public use over the last 10 years. In 2004, over
80% of U.S. homes were equipped with cable
modems, and up to 66% of U.S. households were
able to receive DSL transmissions. It is expected
that broadband technologies will continue
to play an important role in the U.S. and the rest
of the world. It is predicted that the number of
broadband-enabled homes will exceed 90 million 
worldwide by 2007 (Jones, 2003). Canada and Ko- 
rea currently are the two countries leading the way 
in broadband saturation. The following discussion 
focuses on the Canadian case of broadband devel- 
opment. 

Canadian Broadband 

A bandwidth revolution is underway in Canada 
driven by an explosion in computing power and 
access to the world’s fastest research network. 
(Lawes, 2003, p. 19) 

As is the case almost everywhere, the development
of broadband in Canada began with narrowband
Internet. Canada’s main broadband initiative,
CANARIE (Canadian Network for the Advancement
of Research, Industry and Education), can be
traced to regional-federal cooperative network
principles established by NetNorth (forerunner to CA*net)
in the late 1980s and to growing public and private
sector interest in developing high-speed networks
during the early 1990s (Shade, 1994). By 1993,
CANARIE emerged as a not-for-profit federally
incorporated organization consisting of public and
private sector members. Its goal was to create a
networking infrastructure that would enable Canada
to take a leading role in the knowledge-based
economy. The initial three-phase plan to be carried
out within an eight-year period was expected to cost
more than $1 billion with more than $200 million
coming from the federal government. The objectives
of the first phase were to promote network-based
R&D, particularly in areas of product development,
with expected gains in economic trade
advancement. The objectives of the second phase were to
extend the capabilities of CA*net to showcase new
technology applications that advance educational
communities, R&D, and public services. The
objective of the third phase was to develop a high-speed
test network for developing products and services
for competing internationally in a knowledge-based
economy. CANARIE’s overarching aim in the first
three phases was to leverage Canada’s information
technology and telecommunication capacities in
order to advance the Canadian information economy
and society. By the end of CANARIE’s three
phases, high-speed optical computing networking
technology connected public and private institutions
(i.e., universities, research institutes, businesses,
government agencies and laboratories, museums,
hospitals, and libraries, both nationally and
internationally) (Industry Canada, 2003). CANARIE’s
contribution to sustaining the CA*net 4 broadband
network (now in its fourth generation) made it possible
for networks to share applications, computing power,
and other digital resources nationwide and
internationally.

CANARIE also provided funding for a number of
organizations carrying out innovative initiatives
requiring broadband technology, including Absolu
Technologies Inc., Shana Corporation, HyperCore
Technology Inc., Cifra Medical Inc., Broadband
Networks Inc., Callisto Media Systems Inc., The Esys
Corporation, PacketWare Inc., NBTel InterActive,
Nautical Data International Inc., and Miranda
Technologies Inc. CANARIE has funded more than 200
projects involving 500 Canadian companies,
providing an average of 30% of total project costs
(CANARIE, 2003).

BACKGROUND

Recent broadband-based research and development
initiatives in areas of interinstitutional networking
and learning object and learning object repository
development are particularly relevant to the field of
human-computer interaction (HCI). A growing number
of broadband-based research and development
projects are appearing worldwide, such as ICONEX
(UK), JORUM (UK), JISC Information Environment
(UK), AESharenet (Australia), the COLIS project
(Australia), TALON Learning Objects System (US), and
Multimedia Educational Resource for Learning and
Online Teaching (International).

Over the last decade, a number of important
interinstitutional networking and learning object
repository initiatives were spearheaded in Canada.
Through the advancement of grid computing, satellite
communications, and wireless networks, computers
in research labs around the world and in the
field could be connected to a computer network,
allowing users to share applications, computer power,
data, and other resources. Canada’s broadband
network provided a technology infrastructure for a
wide range of large-scale research and development
initiatives, such as virtual astrophysics communities
(Canadian Virtual Observatory), microelectronic
online testing (National Microelectronics and
Photonics Testing Collaboratory), remote satellite
forest monitoring (SAFORAH), brain map database
sharing (RISQ), SchoolNet, and the Canadian
Network of Learning Object Repositories. SchoolNet
was a federal government institutional networking
project developed in 1994 to increase connectivity to
public schools and to promote social equity by allowing
all Canadian schools and public libraries to be
interconnected, regardless of geographical distance.
Through this project, Canada became the first country
in the world to connect all of its public schools to
the Information Highway (SchoolNet, 1999). Another
major initiative was the Canadian Network of
Learning Object Repositories (EduSource Canada),
created in 2002 to develop interoperable learning
object repositories across Canada. EduSource
Canada sponsored a number of learning object
repository projects, including Broadband-Enabled
Lifelong Learning Environment (BELLE), Campus
Alberta Repository of Educational Objects (CAREO),
and Portal for Online Objects in Learning (POOL).


NON-TECHNICAL ASPECTS OF 
BROADBAND TECHNOLOGY 

Key Non-Technical Problems of 
Broadband Technology 

A selected review of federal government databases 
on major broadband initiatives in Canada over the 
last decade reveals a number of problems high- 
lighted in government documents and news reports 
on government broadband efforts. Particularly sa- 
lient are problems revolving around public knowl- 
edge, education, and systemic organization. 

Problem of Public Knowledge 

Although more than $1 billion was invested in 
CANARIE’s projects, very few people know of its
existence. With little public knowledge of its exist- 
ence, CANARIE is in danger of being eliminated 
through economic cutbacks. Efforts to gain media 
attention have not been effective. Despite the pres- 
ence of numerous CANARIE-sponsored gophers, 
Web sites, and press releases, there is a paucity of 
public information available in popular media. 

Problem of Education 

The success of projects like SchoolNet was mea- 
sured in terms of how many computers there were 
in schools and libraries and how many were con- 
nected to the Internet. One major criticism was that 
efforts from project leaders to promote public inter- 
est overemphasized the physical aspects of comput- 
ers and connectivity and underemphasized how indi- 
viduals employ technology for educational ends. 
This partly explains the reluctance of local network
users to participate in many learning object repository
initiatives currently in progress. There is little
point in amassing thousands of learning objects for 
users if they do not see how the objects enhance 
learning. 

Problem of Organization 

The organization of broadband and broadband-based 
initiatives can create problems, particularly where 
public support is vital. There is strong potential for 
public interest in the development of Canadian learning
object repositories and repository networks accessible
by all Canadians. The problem is that
EduSource (established in 2002) and some of the
learning repository initiatives emerged almost a decade
after broadband initiatives were funded. From an
engineering perspective, it made sense to invest in
infrastructure development first and content second.
From a public interest perspective, however, earlier 
development of learning repository networks could 
have raised public interest sooner by providing open 
access. For instance, learning object repositories 
could have been created on narrowband networks 
while broadband network development was still un- 
derway. 

Problem of Funding 

Another problem related to project advancement is
funding and funding continuity. For instance, the
future of CANARIE and CANARIE-sponsored initiatives
was uncertain in 2003 due to delays in the federal
funding commitment to renew the project (Anonymous,
2003). The trickle-down effect of funding uncertainty
for umbrella projects like CANARIE affected
researchers and developers across the country.
The suspension of funding slowed and, in some 
cases, eliminated projects that previously were 
funded. 



FUTURE TRENDS 
Lessons Learned 

A focus on non-technical aspects of broadband tech- 
nology and broadband-based developments revealed 
problems revolving around public knowledge, education,
organization, and funding. Based on the aforementioned
problems, valuable lessons can be derived
to advance future development within Canada
and in similar contexts. First, the existing problem of 
public knowledge suggests that increased public 
understanding of broadband technology is needed. 
One option is to gain greater private-sector participation
in promotional activities connected to
CANARIE (e.g., supporting charities, scholarships,
and donations to public institutions), since companies
are interested in gaining popular support. Greater
private-sector interest in CANARIE could increase
media attention. On the other hand, there also must
be careful consideration of the value structure
governing stakeholder participation in broadband-based
initiatives in the public sphere. Parrish (2004)
raises similar concerns in discussing the value
of metadata development recommendations from
organizational entities like the ARIADNE Foundation
and the Advanced Distributed Learning
(ADL) Initiative, which have vested financial
interests. Next, the problem of education and the
underemphasis on how diverse individuals in different
contexts employ technology for educational
ends suggest that more effort must be invested
in building pedagogical elements into broadband
technology development that satisfy the needs
of diverse populations in different contexts. This is
particularly important in the case of learning object 
and learning object repositories, where learning is a 
major concern in project development. Whiley (2003) 
and Friesen (2003) make similar points in recent 
criticisms of learning object repository trends. Also, 
existing problems of organization suggest that the 
sequence of developmental efforts is also an impor- 
tant consideration. Assuming that public support is 
vital, the organization of broadband and broadband- 
related development efforts should take public interest
into consideration when planning developmental
phases. In the case of Canada, if more effort
had been placed on learning object repository development
in the early 1990s, stronger public support
for current broadband technology might have developed.
Moreover, existing financial problems suggest
that more stable funding investment for large-scale
projects like CANARIE is crucial.

The aforementioned problems concerning broad- 
band development are not limited to the Canadian 
context. For instance, financial difficulties in the
U.S. arising from limited public interest in broadband
services (e.g., Excite@Home) destabilized broadband
development by the end of the 1990s. As was
the case in Canada, limited public knowledge and 
interest were key elements. Despite the efforts of a 
small group of researchers, developers, and compa- 
nies to explore the potential of broadband, the gen- 
eral population was unaware of the benefits and the 
challenges that broadband introduced into their lives. 



Defining Features of Non-Technical
Dimension of Broadband

Based on the discussion of broadband technology,
broadband-based developments, and non-technical
aspects of broadband, this section identifies key
non-technical criteria that must be satisfied in order to
develop a broadband application that will be useful in
a national context. Toward this end, Table 1 describes
four key non-technical dimensions of broadband
development: public knowledge, education,
organization, and funding continuity.

The dimensions listed in Table 1 are not intended
to be an exhaustive list. Rather, they are to be used
as a foundation for exploring other non-technical
questions related to broadband technology and
developmental strategies for broadband application
planning. Broadband and broadband-based applications
have the potential to change the way people
work, learn, entertain themselves, and access fundamental
government services (Ostry, 1994). However,
the general public is still unaware of what
broadband can do for them. If people are to capitalize
on these new technologies in order to improve
their standard of living and their quality of life, then
greater attention to non-technical aspects of broadband
technology in the field of human-computer
interaction (HCI) is required.

CONCLUSION



The purpose of this article was to extend under- 
standing of non-technical aspects of broadband tech- 
nology by focusing on Canadian broadband technol- 
ogy and broadband-based application development. 
It was based on the apparent lack of attention to non- 
technical human aspects of broadband develop- 
ment. The article explored how problems of public 
awareness, education, organization, and funding in- 
fluenced broadband development in Canada. Based 
on lessons learned, the article posited four important, 
non-technical dimensions of broadband develop- 
ment intended to be used as a foundation for explor- 
ing other non-technical questions related to broad- 
band technology and developmental strategies for 
broadband application planning. 



Table 1. Four important, non-technical dimensions of broadband development 



Non-Technical Dimension    Defining Features

Public Knowledge           • Public knowledge of broadband technology and its application
                           • Public knowledge of broadband applicability to diverse cultures
                             and contexts
                           • Public knowledge of public access opportunities to
                             broadband-based services

Education                  • Integration of diverse perspectives to guide broadband
                             development and its conceptualization
                           • Inclusion of pedagogical and instructional design considerations
                             in new developments
                           • Emphasis on broadband applications in areas of learning and
                             skills development

Organization               • Involvement from all stakeholder groups in project planning
                             and development
                           • Indication of multi-level broadband planning and development

Funding                    • Indication of adequate social and economic benefits of
                             broadband
                           • Indication of adequate funding and funding continuity to
                             support new developments

KEY TERMS

Broadband: Refers to Internet connection
speeds greater than the narrowband connection speed
of 56 kbps.

Grid Computing: Linking computers from different
locations to a computer network, which allows
users to share applications, computer power,
data, and other resources. 

Human-Computer Interaction: The study of 
relationships between people and computer tech- 
nologies and the application of multiple knowledge 
bases to improve the benefits of computer technol- 
ogy for society. 

Learning Objects: Learning objects are reusable
digital assets that can be employed to advance
teaching and learning. 

Learning Object Repositories: Digital re- 
sources within a structure accessible through a 
computer network connection using interoperable 
functions. 

Satellite Communications: The amplification 
and transmission of signals between ground stations 
and satellites to permit communication between any 
two points in the world. 

Telecommunications: The exchange of infor- 
mation between computers via telephone lines. This 
typically requires a computer, a modem, and com- 
munications software. 

Wireless Networks: Computer networking that
permits users to transmit data through a wireless
modem connecting the remote computer to Internet
access through radio frequency.

INTRODUCTION

Interface evaluation of a software system is a 
procedure intended to identify and propose solutions 
for usability problems caused by the specific soft- 
ware design. The term evaluation generally refers to 
the process of “gathering data about the usability of 
a design or product by a specified group of users for 
a particular activity within a specified environment 
or work context” (Preece et al., 1994, p. 602). As 
already stated, the main goal of an interface evalu- 
ation is to discover usability problems. A usability 
problem may be defined as anything that interferes 
with a user’s ability to efficiently and effectively 
complete tasks (Karat et al., 1992). 

The most applied interface evaluation method- 
ologies are the expert-based and the empirical (user- 
based) evaluations. Expert evaluation is a relatively 
cheap and efficient formative evaluation method 
applied even on system prototypes or design speci- 
fications up to the almost-ready-to-ship product. 
The main idea is to present the tasks supported by 
the interface to an interdisciplinary group of experts, 
who will take the part of would-be users and try to 
identify possible deficiencies in the interface design. 

According to Reeves (1993), expert-based evalu- 
ations are perhaps the most applied evaluation strat- 
egy. They provide a crucial advantage that makes 
them more affordable compared to the empirical 
ones; in general, it is easier and cheaper to find 
experts rather than users who are eager to perform 
the evaluation. The main idea is that experts from 
different cognitive domains (at least one from the 
domain of HCI and one from the cognitive domain
under evaluation) are asked to judge the interface,
everyone from his or her own point of view. It is 
important that they all are experienced, so they can 
see the interface through the eyes of the user and 
reveal problems and deficiencies of the interface. 
One strong advantage of the methods is that they 
can be applied very early in the design cycle, even on 
paper mock-ups. The expert’s expertise allows the 
expert to understand the functionality of the system 
under construction, even if the expert lacks the 
whole picture of the product. A first look at the basic 
characteristics would be sufficient for an expert. On 
the other hand, user-based evaluations can be ap- 
plied only after the product has reached a certain 
level of completion. 

BACKGROUND 

This article focuses on the expert-based evaluation 
methodology in general and on the walkthrough 
methodologies in particular. The Cognitive Graphi- 
cal Jogthrough method, described in detail in 
Demetriades et al. (1999) and Karoulis et al. (2000), 
belongs to the expert-based evaluation methodologies.
Its origin is in Polson et al.’s (1992) work,
where the initial Cognitive Walkthrough was presented
(Polson et al., 1992; Wharton et al., 1994), and
in the improved version of the Cognitive Jogthrough
(Aedo et al., 1996; Catenazzi et al., 1997; Rowley &
Rhoades, 1992). The main idea in Cognitive
Walkthroughs is to present the interface-supported 
tasks to a group of four to six experts who will play 
the role of would-be users and try to identify any
possible deficiencies in the interface design. In order
to assess the interface, a set of tasks has to be
defined, which characterizes the method as task-based.
Every task consists of a number of actions that 
complete the task. The methods utilize an appropri- 
ately structured questionnaire to record the evalua- 
tors’ ratings. They also are characterized as cogni- 
tive to denote that the focus is on the cognitive 
dimension of the user-interface interaction, and spe- 
cial care should be given to understand the tasks in 
terms of user-defined goals, not just as actions on the 
interface (click, drag, etc.). 

The evaluation procedure takes place as follows: 

• A presenter describes the user’s goal that has 
to be achieved by using the task. Then the 
presenter presents the first action of the first 
task. 

• The evaluators try to (1) pinpoint possible 
problems and deficiencies during the use of the 
interface and (2) estimate the percentage of 
users who will possibly encounter problems. 

• When the first action is finished, the presenter 
presents the second one and so forth, until the 
whole task has been evaluated. Then, the pre- 
senter introduces the second task, following 
the same steps. This iteration continues until all 
the tasks are evaluated. 

• The evaluators have to answer the following 
questions in the questionnaire: 

1. How many users will think this action is 
available? 

2. How many users will think this action is 
appropriate? 

3. How many users will know how to per- 
form the action? (At this point, the pre- 
senter performs the action) 

4. Is the system response obvious? Yes/No 

5. How many users will think that the system 
reaction brings them closer to their goal? 

These questions are based on the CE+ theory of 
exploratory learning by Polson et al. (1992) (Rieman
et al., 1995). Samples of the evaluators’ question- 
naire with the modified phrasing of the questions 
derived from the studies considered here can be 
obtained from http://aiges.csd.auth.gra/academica. 
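
The way the five questionnaire answers could be recorded and combined across evaluators can be sketched as follows. This is only an illustrative sketch, not the authors' instrument: the record layout `ActionRating`, the helper names, the use of percentages for the "how many users" questions, and the 50% flagging threshold are all assumptions introduced here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ActionRating:
    """One evaluator's answers for a single action (hypothetical record layout)."""
    available: float        # Q1: estimated % of users who think the action is available
    appropriate: float      # Q2: estimated % who think the action is appropriate
    know_how: float         # Q3: estimated % who know how to perform the action
    response_obvious: bool  # Q4: is the system response obvious? (Yes/No)
    closer_to_goal: float   # Q5: estimated % who feel the reaction brings them closer to the goal


def summarize_action(ratings: list[ActionRating]) -> dict[str, float]:
    """Average the evaluators' estimates for one action."""
    if not ratings:
        raise ValueError("at least one evaluator rating is required")
    return {
        "available": mean(r.available for r in ratings),
        "appropriate": mean(r.appropriate for r in ratings),
        "know_how": mean(r.know_how for r in ratings),
        # Share of evaluators (in %) who judged the system response obvious.
        "response_obvious": 100.0 * sum(r.response_obvious for r in ratings) / len(ratings),
        "closer_to_goal": mean(r.closer_to_goal for r in ratings),
    }


def flag_problems(summary: dict[str, float], threshold: float = 50.0) -> list[str]:
    """Return the questions whose averaged score falls below the (assumed) threshold."""
    return [question for question, score in summary.items() if score < threshold]
```

A summary of this kind would be computed once per action, so that low-scoring questions point to the step in the task where users are expected to struggle.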



THE COGNITIVE GRAPHICAL 
WALK- AND JOG-THROUGH 
METHODS (CGW/CGJ) 

The basic idea in modifying the walk- and jog- 
through methods was that they both focus on novice 
or casual users who encounter the interface for the 
first time. However, this limits the range of the 
application of the method. Therefore, the time factor
was introduced by recording the user’s growing experience
while working with the interface. This was
operationalized by embedding diagrams
in the questionnaires to enable the evaluators to
record their estimations. Processing the
diagrams produces curves, one for each evaluator;
these curves graphically represent the intuitiveness
and the learning curve of the interface. The learning
curve, in turn, is considered to be the main means
of assessing the novice-becoming-expert pace, which
is the focus of this modification.

Two main types of diagrams are suggested in 
Figure 1. 

The differentiation of the diagrams refers mainly 
to their usability during the sessions, as perceived by 
the evaluators. The main concern of the applications 
was to pinpoint the easiest diagram form to use. 
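
As a rough illustration of how the per-evaluator curves might be processed, the sketch below averages the curves point by point and reports the first attempt at which the averaged curve reaches a target level. The article does not specify what the vertical axis of the diagrams measures; the sketch assumes it is the estimated percentage of users who succeed at a given attempt (the digital, attempt-based form). The function names are hypothetical.

```python
from statistics import mean
from typing import Optional


def mean_learning_curve(curves: dict[str, list[float]]) -> list[float]:
    """Average per-evaluator curves point by point.

    `curves` maps an evaluator name to estimated success rates (in %),
    one value per attempt; all curves must cover the same attempts.
    """
    lengths = {len(points) for points in curves.values()}
    if len(lengths) != 1:
        raise ValueError("all evaluators must rate the same number of attempts")
    return [mean(column) for column in zip(*curves.values())]


def attempts_to_reach(curve: list[float], target: float) -> Optional[int]:
    """Return the first attempt (1-based) at which the curve reaches the target, or None."""
    for attempt, value in enumerate(curve, start=1):
        if value >= target:
            return attempt
    return None
```

A steep averaged curve would suggest that novices become proficient quickly, while a flat one would indicate an interface that remains hard to use even with experience.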

THE FOUR APPLICATIONS 
Application I: The Network Simulator 

The modified method of the Graphical Jogthrough 
was first applied for the evaluation of an educational 
simulation environment, the Network Simulator. Any 
simulation is a software medium that utilizes the 
interactive capabilities of the computer and delivers 
a properly structured environment to the learner, 
where user-system interaction becomes the means 
for knowledge acquisition (Demetriades et al., 1999). 
Consequently, the main characteristics of a simula- 
tion interface that can and must be evaluated are 
intuitiveness (using proper and easily understand- 
able metaphors), transparency (not interfering with 
the learning procedure) (Roth & Chair, 1997), as 
well as easy mapping with the real world (Schank & 
Cleary, 1996). 
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Figure 1. 
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A. The digital form of the diagram (utilizing boxes) and countable assessment (horizontal axis in attempts) 




B. The analog type of the diagram (utilizing lines) and internal assessment (horizontal axis depicts experience) 



Application II: Perivallon 

The second application of the modified method concerned
a piece of software called Perivallon. This educational
software, addressed to junior high school
students, is aimed at supporting the multimedia teaching
of biology and is described in detail in Karavelaki
et al. (2000).

In this particular evaluation, there is additional 
evidence concerning the reliability of the method, 
since the results of this session have been compared 
and combined with the results of an empirical (i.e., 
user-based) evaluation session that followed. The 
details of this approach as well as its results can be 
found in Karoulis and Pombortsis (2000). In brief, the 
results proved the method to be unexpectedly reliable.
The goal-task-action approach was followed in
both sessions. This approach led to an agreement
of over 60% for the expert-based results when
compared to the empirical ones. 
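
One plausible way to quantify such a comparison is sketched below. The article does not state how the figure was computed; the sketch assumes it is the percentage of expert-identified problems that were also observed in the empirical session, and the function name and problem labels are hypothetical.

```python
def accordance(expert_findings: set[str], empirical_findings: set[str]) -> float:
    """Percentage of expert-identified problems also found in the empirical session."""
    if not expert_findings:
        raise ValueError("the expert session reported no problems")
    # Problems reported by both the experts and the users.
    confirmed = expert_findings & empirical_findings
    return 100.0 * len(confirmed) / len(expert_findings)
```

Under this definition, problems found only by users do not lower the score; a symmetric measure (for example, overlap divided by the union of both sets) would be an alternative if missed problems should count as well.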

Application III: Orestis 

The next application of the method was done in 
order to evaluate an educational CD called Orestis, 
which was produced in our laboratory within the 
framework of an EU-funded program. The soft- 
ware to be evaluated lies in the category of CBL 
(Computer-Based Learning) software. 

This session was organized using a minimalist 
concept; namely, with the participation of only four 
evaluators and with the fewest possible resources 
in order to assess the efficiency of the method
when only the bare essentials are used.

Application IV: Ergani CD

This particular evaluation took place during the first 
software usability seminar at the Department of 
Informatics of the University of Athens. The soft- 
ware under evaluation is called Business Plan: How 
to Start and Manage Your Own Business. It is 
vocational software that was constructed by the 
company Polymedia. The main characteristic of this 
session was that all the evaluators were informatics
experts, and all except one were HCI experts.

OVERALL RESULTS 
AND DISCUSSION 

What was actually performed in the present work is 
an empirical observational evaluation of the modi- 
fied CGJ method. The evaluators actually acted as 
users who perceived the method as a tool for achiev- 
ing their goal, which was the evaluation of the 
software under consideration. So, through the utili- 
zation of this tool in four applications, qualitative 
results were collected. Therefore, the results of this 
study are, in fact, the results of an empirical (user- 
based) evaluation of the modified methods of CGW/ 
CGJ. 

Throughout all the applications, the use of the 
diagrams was characterized as successful. After a 
short introduction in which the presenter described 
the method and the use of the diagrams, the evalu- 
ators were able to easily record their assessments 
during the session by ticking the appropriate boxes. 
There was no indication that assessing the augmen- 
tation of the user’s experience during the use of the 
interface would be a difficult task; on the contrary, 
the diagrams were considered to be the appropriate 
tool for assessing this variable, and they quickly 
became transparent in their use. 

The minimalist conduct of one of the sessions
brought controversial results. The unexpected occurrence
of a dispute concerning the notion of the
expert evaluator made the need for a camera
clear; however, the whole minimalist approach of
this session sped up the procedure. A tentative
conclusion is that this minimalist design provides the
best cost/performance combination; however, it is
inadequate in case a dispute occurs.



A research question in this study was also whether 
these modified versions are applicable to interfaces 
of a broader scope. The research has shown that the 
modified methods performed efficiently in both mul- 
timedia and educational interfaces, and valuable 
results with fewer resources (time, money, effort, 
etc.) were obtained. 

However, one always should bear in mind that 
these methods belong to the category of expert- 
based methodologies, and consequently, they show 
the respective advantages and disadvantages. The 
advantages of these methods are that they are 
inexpensive and efficient in relation to the resources 
that are needed, since only a few experts are able to 
pinpoint significant problems. Moreover, they can be 
conducted at almost every stage of the design pro- 
cess, from the system specification stage to the 
construction phase of the product. 

On the other hand, there are certain drawbacks. 
The evaluators must be chosen carefully so that they 
are not biased. Moreover, the nature of the method 
is such that it focuses on the specific tasks and 
actions, often missing the point in the overall facet of 
the system. 

FUTURE TRENDS 

There are some suggestions for further investigation 
regarding the general reliability of the method. A 
fundamental issue that should be examined is the 
validation of the suggestion regarding the augmenta- 
tion of the reliability of the methods by means of its 
combination with empirical sessions. Therefore, there
is a need for more evidence from combined evaluations,
as suggested in Karoulis and Pombortsis
(2000).

Another issue for further investigation is the ideal 
composition of the evaluating team. Which disci- 
plines are considered to be of most importance? 
Which are the desired qualities and skills of the 
evaluators? However, often the appropriate evaluators
are not available. In this case, what are the
compromises one can make, and how far can the
results be threatened in the case of having evaluators
with limited qualities and skills? Finally, are
teachers of informatics, because of their additional
cognitive background, an appropriate group for
selecting the evaluators? Of course, these are all major
questions not yet addressed, the answers to which
might significantly enhance the potential of the methods.



CONCLUSION 

About the Cognitive Graphical
Jogthrough

While the main aim of the original cognitive
walkthrough was only to locate problems and malfunctions
during the use of the interface, the incorporation
of the diagrams adds the possibility of
assessing the intuitiveness and the overall usability of the
interface. So, in the modified CGW/CGJ methods
presented in this article, the aforementioned drawbacks
are either diminished or eliminated. Therefore,
it can be claimed that the methods provide both
augmented reliability compared to all previously
reported studies and augmented usability during
their application.

The Final Proposal

The final form that the method can take can now be
proposed. The following can be suggested:

• The Cognitive Graphical Walkthrough with the
optional use of a video camera. The taped
material needs to be processed only in case of
an emergency, such as an important dispute.
The conductors of the session are reduced to
only one person, the presenter, since it has
been proved (during the last two applications)
that one person is sufficient.

• The evaluation questionnaire must be printed
double-sided, so that the evaluators are not
discouraged by its size.

• The analog type of diagram should be used
primarily, since it seems to be more favored by
the evaluators.

• A verbal modification of the questions is
necessary, as follows:

1. How many users will think that this action
is available, namely, that the system can do
what the user wants and simultaneously
affords the mode for it to be accomplished?

2. How many users will consider this action,
and not some other, to be appropriate for
the intended goal?

3. How many users will know how to perform
this action?

4. Is the system response obvious?

5. How many users will consider that the
system response brings them closer to
their goal?

These modified questions are still in accordance
with the CE+ theory, the theory of exploratory
learning by Polson et al. (1992) (Rieman et al.,
1995), on which all mentioned methods are based.

The graphical gradations in the text (bold type
and smaller-sized characters) are considered to be
important, since they also contribute in their own
way to clarifying the meaning of the questions.

Irrespective of the final form of the evaluation,
the importance of the appropriate evaluating team
must be emphasized once more. Cognitive science
experts are indispensable, since they can lead the
session along the right path, using their comments
and also focusing on the cognitive dimension of the
evaluation. The (anticipated) lack of double experts
(i.e., those with expertise in the cognitive domain as
well as in HCI) is a known weakness of the methods.

KEY TERMS

Cognitive Graphical Walkthrough: A modification
of the initial “Cognitive Walkthrough” interface
evaluation method, which employs diagrams
to enable the evaluators to assess the time
variable as well as to accelerate the evaluation
procedure.

Cognitive Jogthrough: A modification of the
initial “Cognitive Walkthrough” in order to speed up
the procedure. A video camera now records the
evaluation session.

Cognitive Walkthrough: An expert-based in- 
terface evaluation method. Experts perform a 
walkthrough of the interface according to pre-speci- 
fied tasks, trying to pinpoint shortcomings and defi- 
ciencies in it. Their remarks are recorded by a 
recorder and are elaborated by the design team. 

Empirical Interface Evaluation: The empirical
evaluation of an interface implies that users are
involved. Known methods, among others, are
“observational evaluations,” “survey evaluations,” and
“thinking aloud protocols.”

Expert-Based Interface Evaluation: Evaluation
methodology that employs experts from different
cognitive domains to assess an interface. Known
methods included in this methodology are (among
others) “heuristic evaluations,” “cognitive
walkthroughs,” and “formal inspections.”

Interface Evaluation: Interface evaluation of a 
software system is a procedure intended to identify 
and propose solutions for usability problems caused 
by the specific software design. 

Usability Evaluation: A procedure to assess
the usability of an interface. The usability of an
interface usually is expressed according to the
following five parameters: easy to learn, easy to
remember, efficiency of use, few errors, and subjective
satisfaction.

INTRODUCTION

The rich contributions made in the field of human 
computer interaction (HCI) have played a pivotal 
role in shifting the attention of the industry to the 
interaction between users and computers (Myers, 
1998). However, technologies that include hypertext, 
multimedia, and manipulation of graphical objects 
were designed and presented to the users without 
referring to critical findings made in the field of 
cognitive psychology. These findings allow design- 
ers of multimedia educational systems to present 
knowledge in a fashion that would optimize learning. 

BACKGROUND 

The long history of human computer interaction 
(HCI) has witnessed many successes represented 
in insightful research that finds its way to users’ 
desktops. The field influences the means through 
which users interact with computers — from the 
introduction of the mouse (English et al., 1967) and 
applications for text editing (Meyrowitz & Van 
Dam, 1982) to comparatively recent areas of re- 
search involving multimedia systems (Yahaya & 
Sharifuddin, 2000). 

Learning is an activity that requires different 
degrees of cognitive processing. HCI research rec- 
ognized the existence of diversity in learning styles 
(Holt & Solomon, 1996) and devoted much time and 
effort toward this goal. However, Ayre and Nafalski 
(2000) report that the term learning styles is not 
always interpreted the same way and were able to 
offer two major interpretations. The first group 
believes that learning styles emerge from personal- 
ity differences, life experiences, and student learn- 
ing goals, while the second group believes that it 
refers to the way students shape their learning 
method to accommodate teacher expectations, as 
when they follow rote learning when teachers 
expect it. 

The first interpretation includes, in part, a form of 
individual differences but does not explicitly link 
them to individual cognitive differences, which, in 
turn, left researchers with more ambiguity in 
interpreting the different types of learning styles. In 
fact, these differences in interpretation led 
Stahl (1999) to publish a critique in which he cites five 
review papers that agree in concluding that there is 
insufficient evidence to support the claim that 
accommodating learning styles helps to improve children’s 
learning when acquiring the skill to read. He 
criticized Carbo’s reading style inventory and Dunn and 
Dunn’s learning inventory because of their reliance 
on self-report to identify different learning styles of 
students, which, in turn, results in very low 
replication reliability. 

These criticisms are positive in that they indicate 
a requirement to base definitions on formal repli- 
cable theory. A candidate for this is cognitive learn- 
ing theory (CLT), which represents the part of 
cognitive science that focuses on the study of how 
people learn the information presented to them and 
how they internally represent the concepts mentally 
in addition to the cognitive load that is endured during 
the learning process of the concepts. 

Some of the attempts that were made to take 
advantage of the knowledge gained in the field 
include Jonassen (1991), van Jooligan (1999), and 
Ghaoui and Janvier (2004). 

Jonassen (1991) advocates the constructivist 
approach to learning, where students are given 
several tools to help them perform their computation 
or externally represent text they are expected to 
remember. This allows them to focus on the learning 
task at hand. Jonassen (1991) adopts the assumption 
originally proposed by Lajoie and Derry (1993) and 
Derry (1990) that computers fill the role of cognitive 
extensions by performing tasks to support basic 
thinking requirements, such as calculating or holding 
text in memory, and thus allowed computers to be 
labeled cognitive tools. Jonassen’s (1991) central 
claim is that these tools are offered to students to 
lower the cognitive load imposed during the learning 
process, which, in turn, allows them to learn by 
experimentation and discovery. 

Van Jooligan (1999) takes this concept a step 
further by proposing an environment that allows 
students to hypothesize and to pursue the conse- 
quences of their hypotheses. He did this through 
utilizing several windows in the same educational 
system. The system was composed of two main 
modules: the first supports the hypothesis formation 
step by providing menus to guide the process; the 
second provides a formatted presentation of experi- 
ments already tested and their results in a structured 
manner. He also added intelligent support to the 
system by providing feedback to students to guide 
their hypothesis formation approach. 

Ghaoui and Janvier (2004) presented a two-part 
system. The first part identified the various 
personality types, while the second had either an 
interactive or a non-interactive interface. They report an 
increase in memory retention from 63.57% to 71.09% 
for the students using the interactive 
interface. They also provided a description of the 
learning style preferences for the students tested, 
which exhibited particular trends, but these were not 
analyzed in detail. 

Montgomery (1995) published preliminary re- 
sults of a study aimed at identifying how multimedia, 
in particular, can be used to address the needs of 
various learning styles. Results indicate that active 
learners appreciate the use of movies and interac- 
tion, while sensors benefit from the demonstrations. 

Although a glimmer of interest in CLT exists, 
there is a distinct lack of a clear and organized 
framework to help guide educational interface de- 
signers. 

ALIGNMENT MAP FOR MULTIMEDIA 
INSTRUCTIONAL INTERFACE 

The problems that arose with learning styles reveal 
a need for a more fine-grained isolation of various 
cognitive areas that may influence learning. 
Consequently, an alignment map, as shown in Table 1, may 
offer some guidelines as to what aspects of the 
multimedia interface design would benefit from what 
branch of the theory in order to gain a clearer 
channel of communication between the designer and 
the student. 



CASE STUDY: DATA STRUCTURES 
MULTIMEDIA TUTORING SYSTEM 

The alignment map presents itself as an excellent 
basis against which basic design issues of multime- 
dia systems may be considered with the goal of 
making the best possible decisions. 

The multimedia tutoring system considered here 
(Albalooshi & Alkhalifa, 2002) teaches data struc- 
tures and was designed by considering the various 
design issues as dictated by the alignment map that 
was specifically designed for the project and is 
shown in Table 1. An analysis of the key points 
follows: 

1. Amount of Media Offered: The system pre- 
sents information through textual and animated 
presentation only. This is done to avoid cogni- 
tive overload caused by redundancy (Jonassen, 
1991) that would cause students to find the 
material more difficult to comprehend. 

2. How the Screen is Partitioned: The screen 
grants two-thirds of the width to the animation 
window that is to the left of the screen, while 
the verbal description is to the right. Although 
the language used for the textual description is 
in English, all students are Arabs, so they are 
accustomed to finding the text on the right side 
of the screen, because in Arabic, one starts to 
write from the right hand side. This design, 
therefore, targeted this particular pool of stu- 
dents to ensure that both parts of the screen are 
awarded sufficient attention. It presents an 
interface that requires divided attention to two 
screens that complement each other, a factor 
that, according to Hampson (1989), minimizes 
interference between the two modes of pre- 
sentation. 

3. Parallel Delivery of Information: Redundancy 
is desired when it exists in two different 
media, because one reinforces the other. It is 
not desired when it exists within a single medium, as 
when there is textual redundancy and things are 
explained more than once. Consequently, the 
textual description describes what is presented 
in the animation part, especially since only text 
and animation media exist in this case, which 
means that cognitive load issues are not of 
immediate concern (Jonassen, 1991). 

4. Use of Colors: Colors were used sparingly, only 
to highlight the edges of the shapes, to 
ensure that attention is drawn to those edges. By 
doing so, focus is expected to be directed 
toward the object’s axes, as suggested by Marr 
and Nishihara (1978), in order to encourage 
memory recall of the shapes at a later point in 
time. 

5. Use of Animation: The animated data structures 
are under the user’s control with respect 
to starting, stopping, or speed of movement. 
This allows the user to select whether to focus 
on the animation, text, or both in parallel without 
causing cognitive overload. 

6. Use of Interactivity: The level of interactivity 
is limited to the basic controls of the animation. 

7. Aural Media: This type of media is not offered 
by the system. 

8. Verbal Presentation of Material: The verbal 
presentation of the materials is concise and 
fully explains relevant concepts to a sufficient 
level of detail, if considered in isolation of the 
animation. 

Table 1. Alignment map from multimedia design questions to various cognitive research areas that 
may be of relevance 

Multimedia Design Issues             Cognitive Areas That May Be of Relevance 
-----------------------------------  ------------------------------------------------ 
1. Amount of media offered           1. Cognitive load 
                                     2. Limited attention span 
                                     3. Interference between different mental 
                                        representations 
2. How the screen is partitioned     1. Perception and recognition 
                                     2. Attention 
3. Parallel delivery of information  1. Redundancy could cause interference 
                                     2. Limited working memory (cognitive load issues) 
                                     3. Limited attention span 
                                     4. Learner difference 
4. Use of colors                     1. Affects attention focus 
                                     2. Perception of edges to promote recall 
5. Use of animation                  1. Cognitive load reduction 
                                     2. Accommodates visualizer/verbalizer learners 
6. Use of interactivity              1. Cognitive load reduction 
                                     2. Raises the level of learning objectives 
7. Aural media                       1. Speech perception issues like accent and clarity 
                                     2. Interference with other media 
8. Verbal presentation of material   1. Clarity of communication 
                                     2. Accommodates verbal/serialist learners 



EVALUATION OF THE SYSTEM 

The tool first was evaluated for its educational 
impact on students. It was tested on three groups: 
one exposed to the lecture alone; the second to a 
regular classroom lecture in addition to the system; 
and the third only to the system. Students were 
distributed among the three groups such that each 
group had 15 students with a mean grade similar to 
the other two groups in order to ensure that any 
learning that occurs is a result of the influence of 
what they are exposed to. This also made it possible 
for 30 students to attend the same lecture session 
composed of the students of groups one and two, 
while 30 students attended the same lab session 
composed of the students of groups two and three 
in order to avoid any confounding factors. 

Results showed a highly significant improvement 
in test results of the second group when their 
post-classroom levels were compared to their levels 
following use of the multimedia system, with an 
overall improvement rate of 40% recorded 
(F = 9.19, p < 0.005, from an ANOVA test run 
after ensuring that all test requirements had been 
satisfied). The first and third groups showed no 
significant differences between them. 

The results indicate that learning did occur in 
the group that attended the lecture and then used the 
system, which implies that animation does fortify 
learning by reducing the cognitive load. This is 
especially clear when one takes the overall mean 
grade of all groups, which is around 10.5, and checks 
how many in each group are above that mean. Only 
six in group one were above it, while 11 were above 
it in group two and 10 in group three. Since group 
three was exposed to the system-only option and 
achieved a number very close to group two, which 
had the lecture and the system option, 
multimedia clearly did positively affect their learning rate. 
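
The comparison above can be restated as proportions. The following is a minimal sketch that uses only the figures reported in the text (15 students per group; 6, 11, and 10 students above the overall mean of about 10.5). The names `GROUP_SIZE`, `above_mean`, and `share_above` are illustrative assumptions, not part of the original study.

```python
# Counts of students scoring above the overall mean (about 10.5), as reported
# in the text; each group had 15 students. Names are illustrative only.
GROUP_SIZE = 15
above_mean = {
    "group 1 (lecture only)": 6,
    "group 2 (lecture + system)": 11,
    "group 3 (system only)": 10,
}


def share_above(count: int, size: int = GROUP_SIZE) -> float:
    """Return the fraction of a group that scored above the overall mean."""
    return count / size


shares = {name: share_above(n) for name, n in above_mean.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Expressed this way, the two groups that used the system sit well above the lecture-only group, which is the pattern the paragraph describes.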

Since one of the goals of the system is to accom- 
modate learner differences, a test was run on group 
two students in order to identify the visualizers from 
the verbalizers. The paper-folding test designed by 
French et al. (1963) was used to distinguish between 
the two groups. The test requires each subject to 
visualize the array of holes that results from a simple 
process. A paper is folded a certain number of folds, 
a hole is made through the folds, and then the paper 
is unfolded. Students are asked to select the image 
of the unfolded paper that shows the resulting ar- 
rangement, and results are evaluated along a median 
split as high vs. low visualization abilities. 
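
The median split described above can be sketched as follows. The scores are hypothetical values invented for illustration (they are not data from the study), and the handling of scores that fall exactly on the median is an assumption, since the text does not say how ties were treated.

```python
from statistics import median


def median_split(scores: dict[str, float]) -> dict[str, str]:
    """Label each subject 'high' or 'low' relative to the group median.

    Assumption: a score equal to the median is labelled 'low'; the source
    does not state how ties were handled.
    """
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}


# Hypothetical paper-folding scores (number of correct items), illustration only.
example_scores = {"s1": 4, "s2": 9, "s3": 6, "s4": 12, "s5": 7, "s6": 3}
labels = median_split(example_scores)
```

Subjects labelled "high" would be treated as the high-visualization group and those labelled "low" as the low-visualization group in the comparison that follows.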

These results then were compared with respect 
to the percentage of improvement, as shown in Table 
2. Notice that the question numbers in the pre-test 
are mapped to different question numbers in the 
post-test in order to minimize the possibility of 
students being able to recall them; a two-part 
question also was broken up for the same reason. 

Results indicate that, although the group indeed 
was composed of students with different learning 
preferences, they all achieved comparable overall 
improvements in learning. Notice, though, the differ- 
ence in percentage improvement in Question 4. The 
question is: List and explain the data variables that 
are associated with the stack and needed to operate 
on it. This particular question is clearly better suited 
to the verbalizer group than to the visualizer 
group. Therefore, it should not be surprising that the 
verbalizer group finds it much easier to learn how to 
describe the data variables than it is for students who 
like to see the stack in operation. Another point to 
consider is that the visualizer group made a bigger 
improvement in the Q1+Q8 group in response to the 
question: Using an example, explain the stack con- 
cept and its possible use. Clearly, this question is 
better suited to a visualizer than to a verbalizer. 



FUTURE TRENDS 

CLT already has presented us with ample evidence 
of its ability to support the design of more informed 
and, therefore, more effective educational systems. 
This article offers a map that can guide the design 
process of a multimedia educational system by high- 
lighting the areas of CLT that may influence design. 
The aim, therefore, is to attract attention to the vast 
pool of knowledge that exists in CLT that could 
benefit multimedia interface design. 



Table 2. The percentage improvement of each group from the pretest to the posttest across the 
different question types 

                    Q1 plus Q8     Q3 mapped   Q4 mapped   Q6 mapped 
                    mapped to Q1   to Q2       to Q3       to Q6 
------------------  -------------  ----------  ----------  ---------- 
Visualizer group    27.8%          18.6%       9.72%       9.76% 
T-test results      .004           .003        .09         .01 
Verbalizer group    20.7%          22.8%       21.4%       15.7% 
T-test results      .004           .005        .003        .009 
CONCLUSION 

This article offers a precise definition of what is 
implied by a computer-based cognitive tool (CT) as 
opposed to others that were restricted to a brief 
definition of the concept. Here, the main features of 
multimedia were mapped onto cognitive areas that 
may have influence on learning, and the results of an 
educational system that conforms to these design 
requirements were exhibited. 

These results are informative to cognitive scien- 
tists, because they show that the practical version 
must deliver what the theoretical version promises. 
At the same time, results are informative to educa- 
tional multimedia designers by exhibiting that there is 
a replicated theoretical groundwork that awaits their 
contributions to bring them to the world of reality. 

The main conclusion is that this is a perspective 
that allows designers to regard their task from the 
perspective of the cognitive systems they wish to 
learn so that it shifts the focus from a purely teacher- 
centered approach to a learner-centered approach 
without following the route to constructivist learning 
approaches. 
KEY TERMS 

Alignment Map: A representation on a surface to 
clearly show the arrangement or positioning of relative 
items on a straight line or a group of parallel lines. 

Attention: An internal cognitive process by 
which one actively selects a part of the 
environmental information that surrounds one and focuses 
on that part or maintains interest while ignoring 
distractions. 

Cognitive Learning Theory: The branch of 
cognitive science that is concerned with cognition 
and includes parts of cognitive psychology, 
linguistics, computer science, cognitive neuroscience, and 
philosophy of mind. 

Cognitive Load: The degree of cognitive pro- 
cesses required to accomplish a specific task. 

Learner Differences: The differences that exist 
in the manner in which an individual acquires infor- 
mation. 

Multimedia System: Any system that presents 
information through different media that may include 
text, sound, video, computer graphics, and 
animation. 
INTRODUCTION 

Traditionally, programming code that is used to 
construct software user interfaces has been inter- 
twined with the code used to construct the logic of 
that application’s processing operations (e.g., the 
business logic involved in transferring funds in a 
banking application). This tight coupling of user- 
interface code with processing code has meant that 
there is a static link between the result of logic 
operations (e.g., a number produced as the result of 
an addition operation) and the physical form chosen 
to present the result of the operation to the user (e.g., 
how the resulting number is displayed on the screen). 
This static linkage is, however, not found in in- 
stances of natural human-to-human communication. 

Humans naturally separate the content and mean- 
ing that is to be communicated from how it is to be 
physically expressed. This creates the ability to 
choose dynamically the most appropriate encoding 
system for expressing the content and meaning in 
the form most suitable for a given situation. This 
concept of interchangeable physical output can be 
recreated in software through the use of contempo- 
rary design techniques and implementation styles, 
resulting in interfaces that improve accessibility and 
usability for the user. 

BACKGROUND 

This section accordingly reviews certain theories of 
communication from different disciplines and how 
they relate to separating the meaning being commu- 
nicated from the physical form used to convey the 
meaning. 



Claude Shannon (1948), a prominent researcher 
in the field of communication theory during the 20th 
century, put forward the idea that meaning is not 
transmitted in its raw form, but encoded prior to 
transmission. Although Shannon was primarily work- 
ing in the field of communication systems and net- 
works such as those used in telephony, his theory has 
been adopted by those working in the field of human 
communications. Shannon proposed a five-stage 
model describing a communication system. Begin- 
ning with the first stage of this model, the sender of 
the communication creates some content and its 
intended meaning. In the second stage, this content 
is then encoded into a physical form by the sender 
and, in the third stage, transmitted to the receiver. 
Once the communication has been received by the 
receiver from the sender, it is then at its fourth stage, 
whereby it is decoded by the receiver. At the fifth 
and final stage, the content and meaning communi- 
cated by the sender become available to the re- 
ceiver. 
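
The five stages described above can be sketched in code. This is a minimal illustration rather than Shannon's formal model: it assumes a noiseless channel and uses UTF-8 bytes as the "physical form", and the function names `encode`, `transmit`, `decode`, and `communicate` are hypothetical.

```python
def encode(meaning: str) -> bytes:
    """Stage 2: the sender encodes the content into a physical form (UTF-8 bytes here)."""
    return meaning.encode("utf-8")


def transmit(signal: bytes) -> bytes:
    """Stage 3: the signal travels over a channel (assumed noiseless in this sketch)."""
    return bytes(signal)


def decode(signal: bytes) -> str:
    """Stage 4: the receiver decodes the physical form."""
    return signal.decode("utf-8")


def communicate(meaning: str) -> str:
    """Run all five stages in order and return what the receiver ends up with."""
    content = meaning            # Stage 1: the sender creates content and meaning
    signal = encode(content)     # Stage 2: encoding
    received = transmit(signal)  # Stage 3: transmission
    return decode(received)      # Stages 4 and 5: decoding makes the meaning available


message = communicate("no dogs allowed")
```

Because encoding and decoding are separate steps, a different pair of `encode` and `decode` functions could carry the same meaning in a different physical form.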

An example of how Shannon’s (1948) model can 
be applied to human communication is speech-based 
communication between two parties. First, the sender 
of the communication develops some thoughts he or 
she wishes to transmit to the intended receiver of the 
communication. Following on from the thought-gen- 
eration process, the thoughts are then encoded into 
sound by the vocal cords, and further encoded into a 
particular language and ontology (i.e., a set of map- 
pings between words and meaning) according to the 
sender’s background. This sound is subsequently 
transmitted through the air, reaching the receiver’s 
ears where it is decoded by the receiver’s auditory 
system and brain, resulting in the thoughts of the 
sender finally being available to the receiver. 
This split between meaning, its encoding, and the 
physical transmission of the meaning is recognised in 
psychology. Psychology considers that there are 
three stages to receiving data: (a) the receiving of 
sensory stimuli by a person, (b) the perception of 
these stimuli into groups and patterns, and (c) the 
cognitive processing of the groups and patterns to 
associate cognitively the meaning with the data 
(Bruno, 2002). Thus, for example, a receiver may 
see a shape with four sides (the data) and associate 
the name square (the meaning) with it. There is 
accordingly a split between the input a person re- 
ceives and the meaning he or she cognitively asso- 
ciates with that input. 

Consider, for example, the words on this page as 
an example of the psychological process through 
which meaning is transmitted. The first stage of the 
process is where the reader receives sensory stimuli 
in the form of black and white dots transmitted to the 
eyes using light waves of varying wavelength. Upon 
the stimuli reaching the reader, the brain will percep- 
tually group the different dots contained within the 
received stimuli into shapes and, ultimately, the 
reader will cognitively associate the names of letters 
with these shapes and extract the meaning conveyed 
by the words. 

Semiotics, which is the study of signs and their 
meanings (French, Polovina, Vile, & Park, 2003; 
Liu, Clarke, Anderson, Stamper, & Abou-Zeid, 2002), 
also indicates a split between meaning and its physi- 
cal presentation. Within semiotics, the way some- 
thing is presented, known as a sign, is considered to 
be separate from the meaning it conveys. Accord- 
ingly, in semiotics there are three main categories of 
signs: icons, indexes, and symbols. This delineation 
is, however, not mutually exclusive as a particular 
sign may contain elements of all categories. Vile and 
Polovina (2000) define an icon as representative of 
the physical object it is meant to represent; a symbol 
as being a set of stimuli, that by agreed convention, 
have a specific meaning; and indexes as having a 
direct link to a cause, for example, the change of a 
mouse pointer from an arrow shape to an hourglass 
to reflect the busy state of a system. 

This classification of the physical representation 
according to its relationship with the content and 
meaning it conveys provides further opportunities to 
distinguish content and meaning from its physical 
presentation, and to classify the different elements 
of presentation. For example, a shop selling shoes 
may have a sign outside with a picture of a shoe on 
it. The image of the shoe is the sign, or the physical 
presence of the meaning, which in this case is an 
icon, while the fact that it is a shoe shop is the 
intended meaning. Equally, this could be represented 
using the words shoe shop as the physical sign, in 
this case a symbol of the English language, while the 
meaning is again that of a shoe shop. 

This split of content and meaning from its physi- 
cal presentation, which occurs naturally in human 
communication, allows for the same content and 
meaning to be encoded in a variety of different forms 
and encoding methods. For example, the meaning of 
“no dogs allowed” can be encoded in a variety of 
forms. For instance, there might be (a) an 
image of a dog with a cross through it, (b) the words 
“no dogs allowed,” (c) an auditory sequence of 
sounds forming the words “no dogs allowed,” or (d) 
the use of tactile alphabets such as Braille, which is 
used to encode printed writing into a form for the 
blind. However the content and meaning is con- 
veyed, it remains the same regardless of how it is 
physically presented. 

SOFTWARE ARCHITECTURES FOR 
CONTENT SEPARATION 

For the true separation of presentation from content 
to occur in software, therefore, the content (namely 
the data or information itself as well as the application’s 
operations, i.e., its business logic as indicated 
earlier) is stored in a neutral format. This neutrality is 
achieved when the content is untainted by presenta- 
tion considerations. This allows any given content to 
be translated and displayed in any desired presenta- 
tion format (e.g., through an HTML [hypertext 
markup language] Web browser such as Microsoft’s 
Internet Explorer, as an Adobe Acrobat PDF [Por- 
table Document Format], as an e-book, on a mobile 
phone, on a personal digital assistant [PDA], or 
indeed on any other device not mentioned or yet to 
be invented). The theories of detaching content and 
meaning from its physical presentation thus give a 
framework to separate content from presentation. 
Once that conceptual separation can be made, or at 
least continually realisable ways toward it are 
achieved, then this approach can actually be 
deployed in the design and implementation of computer 
systems. 

There are a number of methods offered by con- 
temporary software languages and architectures to 
achieve this detachment between the content and 
meaning, and how the content can thus be displayed. 
In the sphere of Web development, the extensible 
markup language (XML) is one such example. XML 
provides a useful vehicle for separating presentation 
from content (Quin, 2004a). Essentially, unlike HTML 
in which the tags are hard coded (e.g., Head, Body, 
H1, P, and so forth), XML allows designers or 
developers to define their own tags particular to their 
domain (e.g., Name, Address, Account-number, 
Transactions, Debits, Credits, and so forth in, say, 
a banking scenario). How this content is presented 
has, of course, to be defined by the designer or 
developer; he or she can no longer rely on the 
browser to format it by simply recognising the hard- 
coded HTML tags. The extensible stylesheet lan- 
guage (XSL) is the vehicle to achieve this (Quin, 
2004b). Equally, the Scalable Vector Graphics (SVG) 
format, based on XML, is another World Wide Web 
format capable of separating content and meaning 
from presentation. SVG specifies drawing objects, 
their dimensions, colour, and so forth, but leaves the 
determination of presentation modality to the client 
viewer application (Ferraiolo, Jun, & Jackson, 2003). 
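
The idea of storing content in a neutral format and choosing the presentation later can be sketched as follows. This is an illustration only: it uses Python's standard `xml.etree.ElementTree` module in place of XSL, the tag names follow the banking example in the text, and the values and the function names `as_html` and `as_plain_text` are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Content only: domain-specific tags with no presentation information.
# Tag names follow the banking example in the text; the values are invented.
ACCOUNT_XML = """
<Account>
  <Name>A. Customer</Name>
  <Account-number>12345678</Account-number>
</Account>
"""


def as_html(xml_text: str) -> str:
    """Present the content as an HTML table fragment."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"


def as_plain_text(xml_text: str) -> str:
    """Present the same content as plain text, e.g., for a small-screen device."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)


html_view = as_html(ACCOUNT_XML)
text_view = as_plain_text(ACCOUNT_XML)
```

Both views are produced from the same unchanged content; adding a further output form would mean adding another function, not editing the XML.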

Within enterprise systems, this separation can be 
achieved through the use of object-orientated and n-tier 
design methodologies. Object orientation works 
through its embodiment of the four goals of software 
engineering (Booch, 1990; Meyer, 1988; Polovina & 
Strang, 2004). These four goals of software engi- 
neering, namely (a) abstraction, (b) cohesion, (c) 
loose coupling, and (d) modularity, determine the 
principled design of each object that makes up the 
system. They seek to ensure that the object only 
performs the functions specific to its role, for ex- 
ample, to display a piece of information or to perform 
a calculation. Accordingly, these goals seek to en- 
sure that presentation objects only present the infor- 
mation, while logic objects only perform calculations 
and other business-logic operations. The logic 
(content) objects thus do not concern themselves with how the 
information is presented to the user; instead, they 
pass their information to presentation objects, 
which perform this function. 



In addition to embodying the four goals of soft- 
ware engineering, object orientation builds on these 
by providing three further principles: (a) encapsula- 
tion, (b) inheritance, and (c) polymorphism (Booch, 
1990; Meyer, 1988; Polovina & Strang, 2004). 
Inheritance allows an object to inherit the charac- 
teristics and behaviours of another object. Utilising 
this feature, it is possible to extend the functionality 
of an object to include new functionality, which may 
be new buttons or other interface elements within a 
user interface. Polymorphism is used to select an 
object based on its ability to meet a given set of 
criteria when multiple objects perform similar func- 
tions. For example, there may be two objects re- 
sponsible for displaying the same interface ele- 
ment; both display the same content and meaning, 
but using different languages. In this scenario, the 
concept of polymorphism can be used to select the 
one appropriate for the language native to the user. 
Thus, object-orientated design can be used to 
naturally complement the process of separating content 
and meaning from its method of presentation. 
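
The language-selection scenario above can be sketched with two presentation objects that share one interface. The class names, the greeting text, and the fallback to English are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class GreetingPresenter(ABC):
    """Presentation object: knows how to display one piece of content."""

    language: str

    @abstractmethod
    def render(self, user_name: str) -> str:
        ...


class EnglishGreeting(GreetingPresenter):
    language = "en"

    def render(self, user_name: str) -> str:
        return f"Welcome, {user_name}"


class FrenchGreeting(GreetingPresenter):
    language = "fr"

    def render(self, user_name: str) -> str:
        return f"Bienvenue, {user_name}"


def presenter_for(language: str) -> GreetingPresenter:
    """Select the presenter matching the user's language (English is an assumed fallback)."""
    presenters = {cls.language: cls for cls in (EnglishGreeting, FrenchGreeting)}
    return presenters.get(language, EnglishGreeting)()


greeting = presenter_for("fr").render("Amina")
```

The calling code only ever invokes `render`; which concrete presenter runs is decided by the user's language, so the content and its meaning stay the same while the presentation changes.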

A common practice within the field of software 
engineering is to base software designs on common, 
predefined architectures, referred to as patterns. 
One pattern, which lends itself well to the separa- 
tion of content and meaning from its method of 
presentation, is the n-tier architecture. The n-tier 
architecture separates the objects used to create 
the design for a piece of software into layers 
(Fowler, 2003). The objects contained within each 
layer perform a specific group of functions, such as 
data storage. In the three-tier architecture, for 
example, one layer is responsible for handling the 
software’s input and output with the user, another 
handles its business-logic processes, and the final 
layer handles the persistent storage of information 
between sessions of the software being executed. 
Through the use of an n-tier architecture and the 
separation of the different areas of an application’s 
design that it creates, it is possible to separate the 
content from its mode of presentation within soft- 
ware design. 
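
A three-tier arrangement of this kind might look like the sketch below. It is a simplified illustration under stated assumptions: an in-memory dictionary stands in for persistent storage, and the class names and the deposit operation are hypothetical.

```python
class StorageTier:
    """Persistence tier: an in-memory dict stands in for a real database."""

    def __init__(self) -> None:
        self._balances: dict[str, int] = {}

    def load(self, account: str) -> int:
        return self._balances.get(account, 0)

    def save(self, account: str, balance: int) -> None:
        self._balances[account] = balance


class LogicTier:
    """Business-logic tier: has no knowledge of how results are displayed."""

    def __init__(self, storage: StorageTier) -> None:
        self._storage = storage

    def deposit(self, account: str, amount: int) -> int:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        balance = self._storage.load(account) + amount
        self._storage.save(account, balance)
        return balance


class TextPresentationTier:
    """Presentation tier: one of possibly many interchangeable front ends."""

    def __init__(self, logic: LogicTier) -> None:
        self._logic = logic

    def deposit(self, account: str, amount: int) -> str:
        balance = self._logic.deposit(account, amount)
        return f"Account {account}: balance is now {balance}"


ui = TextPresentationTier(LogicTier(StorageTier()))
first = ui.deposit("A-1", 50)
second = ui.deposit("A-1", 25)
```

Replacing `TextPresentationTier` with, say, a speech or Web front end would leave the logic and storage tiers untouched, which is the separation the pattern is meant to provide.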

Software engineering’s ability to separate con- 
tent and meaning from its physical presentation can 
be aided by some contemporary implementation 
methods. These methods are based on component 
architectures that aim to create reusable segments 
of code that can be executed. This enhances object 
orientation, which seeks to create reusable seg- 
ments of software at the source-code level. While 
there is not much difference in the design, having 
reusable segments of executable code translates to 
faster time to change segments, further enhancing 
the plug-and-play nature of software. Microsoft’s 
Component Object Model (COM) is a client-side 
Windows-based component architecture (Microsoft 
Corporation, 1998). This architecture enables pro- 
grams to be built as individual components that are 
linked together using a client application to form a 
complete software program. This approach to soft- 
ware implementation provides the ability to con- 
struct similar pieces of software using the same 
components, where the functionality is common 
between the pieces of software. For example, if the 
storage and logic elements of a piece of software 
were to remain the same but the user interface were 
to be changed due to the differing needs of user 
groups, the same components forming the storage 
and logic sections could be used for all versions of 
the software. Furthermore, this could occur while 
different components were created to provide the 
different user interfaces required. This method would 
reduce the time taken to build and deploy the soft- 
ware amongst a group of diverse users. 
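
The example above can be sketched in a few lines. This is only an illustration in Python, not COM itself; the class names, the account-balance scenario, and the fee value are all hypothetical. The storage and logic components are shared, and only the presentation component is swapped between user groups.

```python
from typing import Protocol


class Presenter(Protocol):
    """Standard interface that every presentation component must offer."""

    def render(self, balance: float) -> str: ...


class Storage:
    """Storage component (hypothetical): holds data in memory for the sketch."""

    def __init__(self) -> None:
        self._balances = {"alice": 120.5}

    def load(self, user: str) -> float:
        return self._balances[user]


class Logic:
    """Business-logic component shared by every version of the software."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def balance_after_fee(self, user: str, fee: float) -> float:
        return self._storage.load(user) - fee


class TextPresenter:
    """Visual interface: formats the result as on-screen text."""

    def render(self, balance: float) -> str:
        return f"Balance: {balance:.2f}"


class SpeechPresenter:
    """Stand-in for an audio interface: returns the phrase to be spoken."""

    def render(self, balance: float) -> str:
        whole, cents = divmod(round(balance * 100), 100)
        return f"Your balance is {whole} dollars and {cents} cents"


def show_balance(logic: Logic, presenter: Presenter, user: str) -> str:
    # Only the presenter differs between user groups; logic and storage are reused.
    return presenter.render(logic.balance_after_fee(user, fee=0.5))
```

Calling `show_balance` with a `TextPresenter` or a `SpeechPresenter` reuses the same `Logic` and `Storage` objects, which is the reuse the paragraph describes.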

Another implementation technique, built around 
distributed components located on different physical 
machines, is Web services (MacDonald, 2004). 
Instead of the components used to build the software 
being located on the same machine, different com- 
ponents can be placed on different machines. This 
results in users being able to share and access the 
same physical instance of objects. This improves on 
COM, which, although it gives access to the same 
components, forces each user to use different in- 
stances of them. One advantage of Web services is 
that they allow the existence of different user inter- 
faces while letting users access the same physical 
objects used for the logic and storage processes. 
This type of deployment will ensure that all users are 
accessing the same data through the same logic 
processes, but allows the flexibility for each user or 
user group to use the interface that is best suited 
to their needs, be they task- or device-dependent 
needs. 



THE HUMAN-COMPUTER 
INTERACTION BENEFITS 

The human race rarely uses fixed associations be- 
tween content or meaning and its physical represen- 
tation. Instead, people encode the meaning into a 
form appropriate for the situation and purpose of the 
communication. Communication can be encoded 
using different ontologies such as different lan- 
guages and terminology. Communication is thus able 
to take different physical channels (e.g., sound 
through the air, or writing on paper), all of which 
attempt to ensure that the content or meaning is 
communicated between the parties in the most accu- 
rate and efficient manner available for the specific 
characteristics of the situation. Currently, this is not 
the case with computer interfaces; contemporary 
interfaces instead tend to adopt a “one size fits all” 
approach for the majority of the interface. 

In taking this one-size-fits-all approach, content 
and meaning may not be transmitted to the user in the 
most accurate form, if it is communicated at all. The 
characteristics of the situation and participants are 
not taken into account. This makes the interface 
harder to use than it might be, if it can be used at all. 
Some users, such as those with a sensory disability 
or those with a different native language, may not be 
able to access the information because it has been 
encoded using an inaccessible physical form (e.g., 
visual stimuli are inaccessible to blind users) or 
because it has been encoded using a language that the 
user does not understand. This immediately prevents the 
user from accessing the content and meaning con- 
veyed by that form of presentation. 

Equally, terminology can hinder the 
ease of use of a user interface. The set of terms whose 
meaning a user knows (i.e., the user’s ontology) is based on 
factors such as the cultural, educational, and social 
background of the user as well as the geographic 
area the user inhabits. This leads to different groups 
of people being familiar with different terminology 
from those in other groups, although there is some 
degree of overlap in the ontologies used by the 
different groups. The user is forced to learn the 
terminology built into the interface before they can 
extract the meaning that it conveys. This imposes a 
learning curve on the user, unless they are already 
familiar with the particular set of terms used. Hence, 
some users will find a one-size-fits-all user interface 
difficult or impossible to use. 

By utilising the facilities offered by contempo- 
rary software-engineering practices, it is possible to 
avoid this one-size-fits-all approach and its inherent 
disadvantages in terms of human-computer interac- 
tion. By allowing the encoding scheme used to 
present software interfaces to change with different 
users, interfaces will begin to mimic the processes 
used to encode content and meaning that are found 
in natural human-to-human communication. This 
change will result in interfaces that are accessible by 
those who could not previously access them, and will 
also result in greater ease of use for those who 
previously had to learn the terminology used within 
the interface, hence improving interface usability. 



FUTURE TRENDS 

One emerging trend is the use of explicit user 
modeling to modify the behaviour and presentation 
of systems based on a user’s historic use of that 
system (Fischer, 2001). Explicit user modeling in- 
volves tracking the preferences and activities of a 
user over time, and building a model representing 
that behaviour and associated preferences. This, 
coupled with the concept of presenting the content 
and meaning in the form most suitable for the user, 
holds the ability to tailor the content to a specific 
individual’s needs. By monitoring how a user re- 
ceives different types of information over time, a 
historic pattern can be developed that can subse- 
quently be used to present the content and meaning 
based on an individual’s actual requirements, not on 
a generalized set of requirements from a specific 
group of users. 
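
A minimal sketch of such a model follows. It assumes, hypothetically, that the system can log whether a user successfully received content in a given presentation form; the class name, method names, and form labels are illustrative and are not taken from Fischer (2001).

```python
from collections import defaultdict


class UserModel:
    """Tracks how often each presentation form worked for one user."""

    def __init__(self) -> None:
        self._attempts = defaultdict(int)
        self._successes = defaultdict(int)

    def record(self, form: str, understood: bool) -> None:
        """Log one interaction and whether the content was received."""
        self._attempts[form] += 1
        if understood:
            self._successes[form] += 1

    def preferred_form(self, default: str = "text") -> str:
        """Return the form with the highest observed success rate."""
        if not self._attempts:
            return default  # no history yet: fall back to a generic choice
        return max(
            self._attempts,
            key=lambda form: self._successes[form] / self._attempts[form],
        )
```

With no history the model falls back to a default form; as interactions are recorded, the choice follows the individual's own pattern rather than a group-level assumption.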



CONCLUSION 

Currently, by entwining the association between 
content and meaning and the physical form used to 
represent it, software user interfaces do not mimic 
natural human-to-human communication. Within 
natural communication, the content and meaning 
that is to be conveyed is detached from its physical 
form, and it is only encoded into a physical form at 
the time of transmission. This timing of the point at 
which the content and meaning are encoded is 
important. It gives the flexibility to encode the 
content and meaning in a form that is suitable for the 
characteristics of the situation (e.g., the channels 
available, the languages used by the parties, and the 
terminology that they know). This ensures that 
humans communicate with each other in what they 
consider to be the most appropriate and accurate 
manner, leading to encoding schemes from which 
the parties can easily access the content and 
meaning. 

This is not currently the case for software user 
interfaces, which use an overly tight associa- 
tion between the content and meaning and the 
physical form used to encode it. By utilising contem- 
porary Web-based or object-oriented component 
architectures, this problem of fixed encoding schemes 
can be overcome. Therefore, software user inter- 
faces can more closely mimic natural language 
encoding and gain all the benefits that it brings. 



KEY TERMS 

Accessibility: The measure of whether a per- 
son can perform an interaction, access information, 
or do anything else. It does not measure how well he 
or she can do it, though. 

Content: The information, such as thoughts, 
ideas, and so forth, that someone wishes to commu- 
nicate. Examples of content could be the ideas and 
concepts conveyed through this article, the fact that 
you must stop when a traffic light is red, and so on. 
Importantly, content is what is to be communicated 
but not how it is to be communicated. 

Encoding: The process by which the 
content and meaning that are to be communicated are 
transformed into a physical form suitable for com- 
munication. It involves transforming thoughts and 
ideas into words, images, actions, and so forth, and 
then further transforming the words or images into 
their physical form. 

Object Orientation: A view of the world based 
on the notion that it is made up of objects classified 
by a hierarchical superclass-subclass structure un- 
der the most generic superclass (or root) known as 
an object. For example, a car is a (subclass of) 
vehicle, a vehicle is a moving object, and a moving 
object is an object. Hence, a car is an object as the 
relationship is transitive and, accordingly, a subclass 
must at least have the attributes and functionality of 
its superclass(es). Thus, if we provide a generic 
user-presentation object with a standard interface, 
then any of its subclasses will conform to that 
standard interface. This enables the plug and play of 
any desired subclass according to the user’s encod- 
ing and decoding needs. 

Physical Form: The actual physical means by 
which thoughts, meaning, concepts, and so forth are 
conveyed. This, therefore, can take the form of any 
physical format, such as the writing or displaying of 
words, the drawing or displaying of images, spoken 
utterances or other forms of sounds, the carrying out 
of actions (e.g., bodily gestures), and so forth. 

Software Architecture: Rather like the archi- 
tecture of a building, software architecture de- 
scribes the principled, structural design of computer 
software. Contemporary software architectures are 
multitier (or n-tier) in nature. Essentially, these stem 
from a two-tier architecture in which user-presenta- 
tion components are separated from the informa- 
tion-content components, hence the two overall 
tiers. Communication occurs through a standard 
interface between the tiers. This enables the easy 
swapping in and out of presentation components, 
thus enabling information to be encoded into the 
most appropriate physical form for a given user at 
any given time. 

Usability: A measure of how well someone can 
use something. Usability, in comparison to accessi- 
bility, looks at factors such as ease of use, efficiency, 
effectiveness, and accuracy. It concentrates on 
factors of an interaction other than whether some- 
one can perform something, access information, and 
so forth, which are all handled by accessibility. 



INTRODUCTION 

Computers can be a source of tremendous benefit 
for those with motor impairments. Enabling com- 
puter access empowers individuals, offering im- 
proved quality of life. This is achieved through 
greater freedom to participate in computer-based 
activities for education and leisure, as well as in- 
creased job potential and satisfaction. 

Physical impairments can impose barriers to ac- 
cess to information technologies. The most preva- 
lent conditions include rheumatic diseases, stroke, 
Parkinson’s disease, multiple sclerosis, cerebral palsy, 
traumatic brain injury, and spinal injuries or disor- 
ders. Cumulative trauma disorders represent a fur- 
ther significant category of injury that may be spe- 
cifically related to computer use. See Kroemer 
(2001) for an extensive bibliography of literature in 
this area. 

Symptoms relevant to computer operation in- 
clude joint stiffness, paralysis in one or more limbs, 
numbness, weakness, bradykinesia (slowness of 
movement), rigidity, impaired balance and coordina- 
tion, tremor, pain, and fatigue. These symptoms can 
be stable or highly variable, both within and between 
individuals. In a study commissioned by Microsoft, 
Forrester Research, Inc. (2003) found that one in 
four working-age adults has some dexterity diffi- 
culty or impairment. Jacko and Vitense (2001) and 
Sears and Young (2003) provide detailed analyses of 
impairments and their effects on computer access. 

There are literally thousands of alternative de- 
vices and software programs designed to help people 
with disabilities to access and use computers (Alli- 
ance for Technology Access, 2000; Glennen & 
DeCoste, 1997; Lazzaro, 1995). This article de- 
scribes access mechanisms typically used by indi- 
viduals with motor impairments, discusses some of 
the trade-offs involved in choosing an input mecha- 
nism, and includes emerging approaches that may 
lead to additional alternatives in the future. 



BACKGROUND 

There is a plethora of computer input devices avail- 
able, each offering potential benefits and weak- 
nesses for motor-impaired users. 

Keyboards 

The appeal of the keyboard is considerable. It can be 
used with very little training, yet experts can achieve 
input speeds far in excess of handwriting speeds 
with minimal conscious effort. The potential of keyboards for 
use by people with disabilities was one of the factors 
that spurred early typewriter development (Cooper, 
1983). 

As keyboards developed, researchers investi- 
gated a number of design features, including key size 
and shape, keyboard height, size, and slope, and the 
force required to activate keys. Greenstein and 
Arnaut (1987) and Potosnak (1988) provide summa- 
ries of these studies. 

Today, many different variations on the basic 
keyboard theme are available (Lazzaro, 1996), in- 
cluding the following. 

• Ergonomic keyboards shaped to reduce the 
chances of injury and to increase comfort, 
productivity, and accuracy. For example, the 
Microsoft® Natural Keyboard has a convex 
surface and splits the keys into two sections, 
one for each hand, in order to reduce wrist 
flexion for touch typists. The Kinesis® Ergo- 
nomic Keyboard also separates the layout into 
right- and left-handed portions, but has a con- 
cave surface for each hand designed to minimise 
the digit strength required to reach the keys and 
to help the hands maintain a flat, neutral posi- 
tion. 

• Oversized keyboards with large keys that are 
easier to isolate. 

• Undersized keyboards that require a smaller 
range of movement. 

• One-handed keyboards shaped for left- or right- 
handed operation. These may have a full set of 
keys, or a reduced set with keys that are 
pressed in combinations in the same way a 
woodwind instrument is played. 

• Membrane keyboards that replace traditional 
keys with flat, touch-sensitive areas. 

For some individuals, typing accuracy can be 
improved by using a key guard. Key guards are 
simply attachments that fit over the standard key- 
board with holes above each of the keys. They 
provide a solid surface for resting hands and fingers 
on, making them less tiring to use than a standard 
keyboard for which the hands are held suspended 
above. They also reduce the likelihood of accidental, 
erroneous key presses. Some users find that key 
guards improve both the speed and accuracy of their 
typing. Others find that key guards slow down their 
typing (McCormack, 1990), and they can make it 
difficult to see the letters on the keys (Cook & 
Hussey, 1995). 

The Mouse 

A mouse is a device that the user physically moves 
across a flat surface in order to produce cursor 
movement on the screen. Selection operations are 
made by clicking or double clicking a button on the 
mouse, and drag operations are performed by hold- 
ing down the appropriate button while moving the 
mouse. Because the buttons are integrated with the 
device being moved, some people with motor impair- 
ments experience difficulties such as unwanted 
clicks, slipping while clicking, or dropping the mouse 
button while dragging (Trewin & Pain, 1999). Trem- 
ors, spasms, or lack of coordination can cause 
difficulties with mouse positioning. 



Trackball 

Trackballs offer equivalent functionality to a mouse, 
but are more straightforward to control. This device 
consists of a ball mounted in a base. The cursor is 
moved by rolling the ball in its casing, and the speed 
of movement is a function of the speed with which 
the ball is rolled. Buttons for performing click and 
double-click operations are positioned on the base, 
which makes it easier to click without simulta- 
neously moving the cursor position. For dragging, 
some trackballs require a button to be held down 
while rolling the ball, while others have a specific 
button that initiates and terminates a drag operation 
without needing to be held down during positioning. 

Thumb movement is usually all that is required to 
move the cursor to the extremities of the screen, as 
compared to the large range of skills necessary to 
perform the equivalent cursor movement with a 
mouse. 

Joystick 

The joystick is a pointing device that consists of a 
lever mounted on a base. The lever may be grasped 
with the whole hand and have integrated buttons, or 
may be operated with the fingers, with buttons 
mounted on the base. The cursor is moved by moving 
the lever in the desired direction. When the lever is 
released, it returns to its original, central position. Of 
most relevance are models in which the cursor 
moves at a fixed or steadily accelerating rate in the 
direction indicated by lever movement and retains its 
final position when the lever is released. The buttons 
are often located on the base of such models, and a 
drag button is generally included since it is difficult 
to hold down a button while moving the lever with a 
single hand. 

Isometric Devices 

Isometric devices measure force input rather than 
displacement. An example is the TrackPoint device 
supplied with IBM laptops: a small red button located 
in the center of the keyboard. These devices do not 
require any limb movement to generate the input, 
only muscle contractions. It has been postulated 
that some spasticity in particular is brought on by limb 
movement, so these devices offer a means of avoiding 
it. 

Studies performed using isometric joysticks (Rao, 
Rami, Rahman, & Benvunuto, 1997), and an adapted 
Spaceball Avenger (Stapleford & Maloney, 1997), 
have shown that the ability to position the cursor is 
improved by using isometric devices. 

Touch Pad 

The touch pad is a flat, touch-sensitive surface 
representing all or part of the screen. The cursor is 
moved by sliding a finger across the surface in the 
desired direction. It requires only a small range of 
motion. Buttons for object selection are located near 
the touch surface, and a drag button may or may not 
be available. 

Switch Input 

The most physically straightforward input device to 
operate is a single switch. Switches can come in 
many different formats (Lazzaro, 1995) and be acti- 
vated by hand, foot, head, or any distinct, controlled 
movement. There are also mouth switches, operated 
by tongue position or by sudden inhalation or exhala- 
tion (sip-puff switches). If a user is capable of 
generating several of these motions independently, 
then it is possible to increase the number of switches 
to accommodate this and increase the information 
transfer bandwidth. 

Given their low cost and the relatively low level 
of movement required to operate them, switches 
have become extremely popular as the preferred 
method of input for more severely impaired users. 
However, they do have drawbacks. 

It is necessary to use switches in conjunction with 
some kind of software adaptation to generate the full 
range of input of a keyboard-mouse combination. 
The most frequently used method for this is scanning 
input. This involves taking a regular array of on- 
screen buttons, be they symbols, letters, or keys, and 
highlighting regions of the screen in turn. The high- 
lighting dwells over that region of the screen for a 
predetermined duration, then moves to another part 
of the screen, dwells there, and so on until the user 
selects a particular region. This region is then 
highlighted in subregions and this continues until a 
particular button is selected. Therefore, this pro- 
cess can involve several periods of waiting for the 
appropriate sections of the screen to be highlighted, 
during which the user is producing no useful infor- 
mation. 

Brewster, Raty, and Kortekangas (1996) report 
that for some users, each menu item must be 
highlighted for as much as five seconds. There has 
been much research on efficient scanning mecha- 
nisms, virtual keyboard layouts, and other ways of 
accelerating scanning input rates (e.g., Brewster et 
al., 1996; Simpson & Koester, 1999). 
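
The cost of waiting can be estimated with a small calculation. The sketch below assumes a rectangular grid of on-screen buttons, an error-free user, and row-column scanning as one common form of the region-then-subregion scheme described above; the function names are illustrative.

```python
def linear_scan_steps(row: int, col: int, cols: int) -> int:
    """Highlights needed when every button is scanned one at a time.

    Rows and columns are counted from zero; `cols` is the grid width.
    """
    return row * cols + col + 1


def row_column_scan_steps(row: int, col: int) -> int:
    """Highlights needed when rows are scanned first, then buttons in the chosen row."""
    return (row + 1) + (col + 1)


def selection_time(steps: int, dwell_seconds: float) -> float:
    """Time spent waiting, assuming each highlight dwells for a fixed duration."""
    return steps * dwell_seconds
```

For a button in the third row and sixth column of a 10-column grid (`row=2`, `col=5`), linear scanning needs 26 highlights and row-column scanning needs 9. At a dwell time of five seconds, that is 130 seconds against 45 seconds for a single selection.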

A switch can also be used to provide input in 
Morse code. This can be faster than scanning, but 
requires more accurate control of switch timing and 
the ability to remember the codes. 
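
The dependence on switch timing can be illustrated with a simple decoder. This is a sketch under stated assumptions: the thresholds `dot_max` and `letter_gap` are arbitrary illustrative values, and each switch event is represented as a hypothetical pair of press duration and the pause that follows it.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}


def decode(events, dot_max=0.2, letter_gap=0.6):
    """Decode switch presses into text.

    events: list of (press_duration, gap_after) pairs, in seconds.
    A press no longer than `dot_max` is a dot, anything longer is a dash;
    a pause of at least `letter_gap` ends the current letter.
    """
    text, symbol = [], ""
    for press, gap in events:
        symbol += "." if press <= dot_max else "-"
        if gap >= letter_gap:
            text.append(MORSE.get(symbol, "?"))
            symbol = ""
    if symbol:  # flush the final letter
        text.append(MORSE.get(symbol, "?"))
    return "".join(text)
```

With these thresholds, a press intended as a dot but held for 0.25 seconds is read as a dash, so an intended "E" becomes "T". A user with variable motor control would need the thresholds tuned individually.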

Head-Motion Transducers 

Head-pointing systems operate by detecting the 
user’s head position and/or orientation using ultra- 
sonic, optical, or magnetic signals, and using that 
information to control the cursor. Nisbet and Poon 
(1998) describe a number of existing systems and 
note them to be easy to use, providing both speed 
and accuracy. The majority of these systems are 
ultrasound based, such as the Logitech 6D mouse 
and the HeadMaster system (Prentke-Romich). 
An alternative system is the HeadMouse (Origin 
Instruments), which involves the user wearing a 
small reflective patch on either the forehead or the 
bridge of a pair of spectacles. 

As with most of the mouse-replacement sys- 
tems, no software-interface modifications are nec- 
essary to access most existing applications. How- 
ever, some kind of switch device is needed to make 
selections. This is often a mouth-mounted sip-puff 
switch as most users of these systems do not have 
sufficiently good arm movement to operate a hand 
switch. 

Learning to use head movements to control the 
cursor can take a little while as there is a lack of 
tactile feedback from the input device, but once 
used to it, users can control the cursor quite suc- 
cessfully. 

Eye Gaze Input 

A review of these and other input devices in the 
context of wheelchairs and environmental controls 
is presented by Shaw, Loomis, and Crisman (1995). 
Edwards (1995) reports that the most successful 
eye-gaze systems operate by detecting infrared light 
bounced off the user’s retina, but there are still many 
unsolved problems with this technology, such as 
coping with head movements. Today’s eye-gaze sys- 
tems may be either head mounted or remote, can be 
accurate to within 1 cm, and can be used to control 
off-the-shelf applications (e.g., ERICA, 
http://www.eyeresponse.com/ericasystem.html). 

Speech Recognition Systems 

Speech input is widely touted as the eventual suc- 
cessor to the keyboard, being a natural form of 
human communication. Speech is a potentially very 
good input medium for motor-impaired users, al- 
though speech difficulties accompanying motor im- 
pairments may impede the recognition process. Nisbet 
and Poon (1998) also note that some users have 
reported voice strain from using this input technique. 

Besides the technical difficulties of the actual 
recognition process, environmental considerations 
also have to be addressed. Users with motor impair- 
ments may be self-conscious and wish to avoid 
drawing attention to themselves. An input system 
that involves speaking aloud fails to facilitate this. 
However, there have been cases in which speech 
recognition systems have been found to be a good 
solution. 

Speech recognition systems can be programmed 
to offer verbal cursor control and so can replace both 
the keyboard and mouse in the interaction process. 
However, true hands-free operation is not provided 
in most of today’s products. 



SOFTWARE SUPPORTING PHYSICAL 
ACCESS 

Software programs can be used to alter the behaviour 
of input devices or the input requirements of applica- 
tions. Software modifications can tackle input errors 
by changing an input device’s response to specific 
inputs. They can reduce fatigue by reducing the 
volume of input required, or they can provide alter- 
natives to difficult movements. They can also 
minimise the input required of the user, thus reducing 
effort and opportunities for error. Some examples of 
useful facilities that can be implemented in software 
are the following. 

• The keyboard and mouse configuration, or the 
way the keyboard or mouse reacts to a given 
input, can be changed. For example, the delay 
before a key starts to repeat can be altered, or 
the cursor can be made to move more slowly 
relative to the mouse. Another option sets the 
computer to ignore repeated presses of a key that 
occur within a set threshold time. 
This filters the input for tremor cases in which 
the user depresses a key more than once for a 
single character input. Another powerful op- 
tion is Sticky Keys. This registers the pressing 
of keys such as Shift and Control and holds 
them active until another key is pressed. This 
removes the need for the user to operate sev- 
eral keys simultaneously to activate keyboard 
shortcuts, hence simplifying the degree of co- 
ordination demanded for the input. Simple al- 
terations like these can be very effective 
(Brown, 1992; Nisbet & Poon, 1998; Trewin & 
Pain, 1998). 

• For those who do not use a keyboard but have 
some form of pointing device, an on-screen 
keyboard emulator can be used to provide text 
input. On-screen keyboards are built into many 
modern operating systems. Several commer- 
cial versions also exist, such as WiViK. 

• For users who find input slow or laborious, 
macros can be used to perform common se- 
quences of operations with a single command. 
For example, a user who always enters the 
same application and opens the same file after 
logging on to a system could define a macro to 
open the file automatically. Many word-pro- 
cessing packages also include macro facilities 
to allow commonly used text to be reproduced 
quickly. For example, a user could create a 
macro representing his or her address as it 
appears at the top of a letter. 

• When knowledge about the user’s task is avail- 
able, more advanced typing support can be 
provided through word prediction. Many word- 
prediction systems have been developed, and a 
number are reviewed in Millar and Nisbet 
(1993). As a user begins to type a word, the 
prediction system offers suggestions for what 
the word might be. If the desired word is 
suggested, the user can choose it with a single 
command. In practice, word-prediction sys- 
tems have been observed to reduce the number 
of keystrokes required by up to 60% (Newell, 
Arnott, Cairns, Ricketts, & Gregor, 1995). 
Newell et al. report that using the PAL word- 
prediction system, some users were able to 
double their input speed. However, studies 
with disabled users have also shown that a 
reduction in keystrokes does not necessarily 
produce an increase in input rate (for more 
detailed summaries, see Horstmann, Koester, 
& Levine, 1994; Horstmann & Levine, 1991). 
Word prediction is most useful for very slow 
typists, particularly switch users. Those who 
type at a rate of greater than around 15 words 
a minute may find that the time spent searching 
the lists of suggestions for the right word is 
greater than the time saved (Millar & Nisbet). 
Nevertheless, faster users may still find word 
prediction helpful in reducing fatigue, reducing 
errors, or improving spelling (Millar & Nisbet). 
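
The word-prediction mechanism in the last item can be sketched as a frequency-ranked prefix lookup. This is a deliberately simple illustration, not the method used by PAL or any other system cited above; the class and function names are hypothetical, and real systems typically also use context and recency.

```python
from collections import Counter


class WordPredictor:
    """Suggests completions for a typed prefix, most frequent words first."""

    def __init__(self, corpus: str) -> None:
        # Word frequencies learned from sample text supplied by the caller.
        self._freq = Counter(corpus.lower().split())

    def suggest(self, prefix: str, limit: int = 3) -> list[str]:
        prefix = prefix.lower()
        matches = [w for w in self._freq if w.startswith(prefix)]
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda w: (-self._freq[w], w))
        return matches[:limit]


def keystrokes_saved(word: str, typed_prefix: str) -> int:
    """Keystrokes saved if the word is chosen with a single selection command."""
    return len(word) - (len(typed_prefix) + 1)
```

Typing "ke" and then selecting "keyboard" costs three actions instead of eight, a saving of five keystrokes. The time spent reading the suggestion list is not modelled here, which is why a keystroke saving does not automatically translate into a faster input rate.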

Input acceleration and configuration are 
complementary approaches. Configuration improves 
accuracy and/or comfort, while input acceleration reduces the 
volume of input required, thus increasing the input 
rate. For users with slow input rates, or those who 
tire easily, both techniques can be useful. 
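
Two of the configuration options described earlier, ignoring rapid repeated key presses and Sticky Keys, can be sketched as simple filters over a stream of key events. The event representation, function names, and the restriction to Shift and Control are assumptions made for illustration; operating systems implement these facilities in their own ways.

```python
MODIFIERS = {"Shift", "Control"}  # illustrative subset of modifier keys


def filter_repeats(events, threshold):
    """Drop a press that repeats the same key within `threshold` seconds.

    events: list of (time_seconds, key) pairs in chronological order.
    The timer restarts on every press, so a burst of tremor-induced
    presses is reduced to its first press.
    """
    accepted, last_time = [], {}
    for t, key in events:
        previous = last_time.get(key)
        last_time[key] = t
        if previous is not None and t - previous < threshold:
            continue  # treated as an unintended repeat
        accepted.append((t, key))
    return accepted


def apply_sticky_keys(keys):
    """Turn sequential presses such as Control, c into the chord "Control+c"."""
    output, held = [], []
    for key in keys:
        if key in MODIFIERS:
            if key not in held:
                held.append(key)  # latch the modifier until the next ordinary key
        else:
            output.append("+".join(held + [key]))
            held = []
    return output
```

For example, `apply_sticky_keys(["Control", "c"])` yields `["Control+c"]`, so the shortcut is produced without pressing two keys at once.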

CHOOSING AN APPROPRIATE 
INPUT MECHANISM 

Finding the best input mechanism for a given indi- 
vidual requires the analysis and adjustment of many 
interrelated variables, including the choice of device, 
its position and orientation, the need for physical 
control enhancers such as mouth sticks, and the 
available configuration options of the device itself 
(Cook & Hussey, 1995; Lee & Thomas, 1990). This 
is best achieved through professional assessment. In 
an assessment session, an individual may try several 
devices, each in a number of different positions, with 
different control enhancers. 

Users often prefer to use standard equipment 
whenever possible (Edwards, 1995). Alternative 
input devices can often be slower to use and may not 
provide access to the full functionality of applica- 
tions the user wishes to use (Anderson & Smith, 
1996; Mankoff, Dey, Batra, & Moore, 2002; Shaw 
et al., 1995). Also, skills for standard equipment 
learned at home can be transferred to machines at 
work, college, or other public places. Finally, using 
standard equipment does not identify the user as different or dis- 
abled. 

Many less severely impaired users can use stan- 
dard input devices with minor modifications. For 
some people, positioning is very important. Adjust- 
able tables allow keyboard and screen height to be 
adjusted, and this in itself can have a dramatic effect 
on input accuracy. The keyboard tilt can also be 
adjusted. For those who find it tiring to hold their 
arms above the keyboard or mouse, arm supports 
can be fitted to tables. Wrist rests can also provide 
a steadying surface for keyboard and mouse use. 
Some users wear finger or hand splints while others 
use a prodder or head stick to activate keys. 

The potential effectiveness of physical modifica- 
tions to input devices is illustrated by Treviranus, 
Shein, Hamann, Thomas, Milner, and Parnes (1990), 
who describe three case studies of users for whom 
modification of standard pointing devices was re- 
quired. They define the physical demands made by 
direct manipulation interfaces, and the difficulties 
these caused for three users with disabilities. In all 
cases, the final solution involved a combination of 
pointing devices or minor modifications to a standard 
device. 



FUTURE TRENDS 

Clearly, the field of input device technology is an 
evolving one, with new technologies emerging all the 
time. For example, some of the most exciting devel- 
opments in computer input in recent years have been 
in the field of brain-computer interfaces. For re- 
views of recent research progress, see Moore (2003) 
and Wolpaw, Birbaumer, McFarland, Pfurtscheller, 
and Vaughan (2002). A brain-computer interface is 
a system in which electrical brain activity is mea- 
sured and interpreted by a computer in order to 
provide computer-based control without reliance on 
muscle movement. Contrary to popular opinion, such 
interfaces do not read a person’s thoughts. Instead, 
the person learns to control an aspect of his or her 
brain signals that can be detected and measured. 

Such interfaces represent what may be the only 
possible source of communication for people with 
severe physical impairments such as locked-in syn- 
drome. Brain-computer interfaces have been shown 
to enable severely impaired individuals to operate 
environmental-control systems and virtual keyboards, 
browse the Web, and even make physical move- 
ments (Moore, 2003; Perelmouter & Birbaumer, 
2000; Wolpaw, Birbaumer, et al., 2002). Clinical 
trials are under way for at least one commercial 
system, the BrainGate by Cyberkinetics Inc. 
(http://www.cyberkineticsinc.com). 

A related computer-control system already on 
the market, Cyberlink Brainfingers, is a hybrid brain- 
and body-signal transducer consisting of a headband 
that measures brain and muscle activity in the fore- 
head. Information is available from Brain Actuated 
Technologies Inc. (http://www.brainfingers.com). 

Computer input devices will also have to evolve 
to match changes in software user interfaces. For 
example, the next generation of Microsoft’s ubiqui- 
tous Windows operating system will apparently move 
from two-dimensional (2-D) interfaces to three- 
dimensional (3-D) ones. While the dominant output 
technologies, that is, monitors and LCD panels, 
remain two-dimensional, it is likely that 2-D input 
devices such as the mouse will continue to be used. 
However, when three-dimensional output technolo- 
gies become more common, there will be a need to 
migrate to 3-D input devices. If the 3-D outputs are 
genuinely immersive, this may benefit motor-im- 
paired users as the targets could be enlarged, allow- 
ing for larger gross movements for selecting them. 
However, if the outputs remain comparatively small, 
then the difficulties of locating, selecting, and acti- 
vating targets in two dimensions on the screen are 
going to be further compounded by the addition of a 
third dimension. Consequently, the jury is still out on 
whether the move to 3-D will be beneficial or not for 
motor-impaired users. 



CONCLUSION 

Many computer access options are available to 
people with motor impairments. For those individu- 
als who prefer to use standard computer input 
devices, accuracy and comfort can be improved 
through modifications to device positioning, the use 
of control enhancers such as wrist rests, appropriate 
device configuration, and software to accelerate 
input rates. Where standard devices are not appro- 
priate, the above enhancements can be used in 
conjunction with alternative devices such as trackballs 
and head or eye gaze systems. When choosing an 
input setup, professional assessment is highly ben- 
eficial. 

Speech input is useful for some individuals but 
has significant drawbacks. Brain-computer inter- 
faces show great promise, offering hope to individu- 
als with severe physical impairments. 

Accessibility: A characteristic of information 
technology that allows it to be used by people with 
different abilities. In more general terms, accessibil- 
ity refers to the ability of people with disabilities to 
access public and private spaces. 

Assessment: A process of assisting an indi- 
vidual with a disability in the selection of appropriate 
assistive technology devices and/or configurations 
of standard information technology devices. 

Input Acceleration: Techniques for expanding 
user input, allowing a large volume of input to be 
provided with few user actions. 

Motor Impairment: A problem in body motor 
function or structure such as significant deviation or 
loss. 

Transducer: An electronic device that converts 
energy from one form to another. 

INTRODUCTION 

A concept map (also known as a knowledge map) 
is a visual representation of knowledge of a domain. 
A concept map consists of nodes representing con- 
cepts, objects, events, or actions connected by direc- 
tional links defining the semantic relationships be- 
tween and among nodes. Graphically, a node is 
represented by a geometric object, such as a rect- 
angle or oval, containing a textual name; relationship 
links between nodes appear as textually labeled lines 
with an arrowhead at one or both ends indicating the 
directionality of the represented relation. Together, 
nodes and links define propositions or assertions 
about a topic, domain, or thing. For example, an 
arrow labeled “has” beginning at a node labeled “bird” 
and ending at a “wings” node represents the proposition 
“A bird has wings” and might be a portion of a 
concept map concerning birds, as portrayed in 
Figure 1. 
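
The node-and-link structure described above can be sketched in a few lines of Python. This is only an illustrative model, not the data format of any particular concept-mapping tool; the class names `Link` and `ConceptMap` and the second example link are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """A directional, labeled link between two named nodes."""
    source: str
    label: str
    target: str


@dataclass
class ConceptMap:
    """Nodes are concept names; links carry the semantic relationship."""
    nodes: set = field(default_factory=set)
    links: list = field(default_factory=list)

    def add_link(self, source, label, target):
        # Adding a link implicitly adds both endpoint nodes.
        self.nodes.update((source, target))
        self.links.append(Link(source, label, target))

    def propositions(self):
        # Each link reads as a simple "source label target" assertion.
        return [f"{l.source} {l.label} {l.target}" for l in self.links]


birds = ConceptMap()
birds.add_link("bird", "has", "wings")
birds.add_link("bird", "can", "fly")  # hypothetical second link
print(birds.propositions())  # ['bird has wings', 'bird can fly']
```

Reading each link as a "source label target" triple is what lets a map be interpreted as a set of propositions about the domain.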



BACKGROUND: CONCEPT MAPS AS 
KNOWLEDGE REPRESENTATION 

Representing knowledge in this fashion is similar to 
semantic network knowledge representation from 
the experimental psychology and AI (artificial intel- 
ligence) communities (Quillian, 1968). Some have 
argued that concept maps accurately reflect the 
content of their authors’ knowledge of a domain 
(Jonassen, 1992) as well as the structure of that 
knowledge in the authors’ cognitive system (Ander- 
son-Inman & Ditson, 1999). Indeed, in addition to 
the structured relationships among knowledge ele- 
ments (nodes and links) that appear in a single map, 
some concept mapping tools allow for multiple layer 
maps. The structure of such maps is isomorphic to 
the cognitive mechanisms of abstraction, wherein a 
single node at one level of a map may represent a 
chunk of knowledge that can be further elaborated 
by any number of knowledge elements at a more 



Figure 1. A concept map in the Webster concept mapping tool. Nodes in this concept map portray a 
variety of representational possibilities: A node may contain a textual description of a concept, 
object, event, or action, or may be an image or a link to a Web site, audio, video, spreadsheet, or any 
other application-specific document. 
detailed level of the overall map (Alpert, 2003). 
Concept maps, thus, can be viewed as knowledge 
visualization tools. 



CONCEPT MAPS AS 
COGNITIVE TOOL 

Concept maps have been used in educational set- 
tings since the early 1970s as both pedagogical and 
evaluation tools in virtually every subject area: read- 
ing and story comprehension, science, engineering, 
math word problems, social studies, and decision 
making (see, e.g., Bromley, 1996; Chase & Jensen, 
1999; Fisher, Faletti, Patterson, Thornton, Lipson, & 
Spring, 1990; Novak, 1998). Concept maps permit 
students to demonstrate their knowledge of a do- 
main; act as organizational and visualization tools to 
aid study and comprehension of a domain, a story, or 
an expository text; and support the generation and 
organization of thoughts and ideas in preparation for 
prose composition. They are also used as instruc- 
tional materials whereby teacher-prepared maps 
present new materials to learners, showing the 
concepts and relationships among concepts of a 
domain new to the students. Concept maps con- 
structed by students help those students to learn and 
exercise the metacognitive practice of reflecting on 
what they know to explain or demonstrate their 
knowledge to others. Such activities may lead to 
self-clarification and elaboration of their knowledge. 
There is considerable anecdotal and experimental 
evidence that the use of graphical knowledge-visu- 
alization tools such as concept maps helps improve 
student comprehension and enhance learning. For 
example, Fisher et al. (1990) have reported that 
providing concept maps constructed by domain ex- 
perts to present new information to learners and 
illustrating how an expert organizes concepts of the 
domain results in demonstrable pedagogical ben- 
efits. Dunston (1992) and Moore and Readance 
(1984) have shown that concept maps are pedagogi- 
cally effective when students create their own maps 
to reflect on and demonstrate their own knowledge. 

In educational environments, the use of concept 
maps has evolved from paper-and-pencil to com- 
puter-based tools. A number of computer-based 
concept-mapping tools have been reported by 
researchers (e.g., Alpert & Grueneberg, 2000; Fisher 
et al., 1990; Gaines & Shaw, 1995b; Kommers, 
Jonassen, & Mayes, 1992), and there exist shareware 
programs as well as commercial products for this 
activity (e.g., Inspiration, 1 Axon, 2 Decision Explorer, 3 
SemNet, 4 SMART Ideas, 5 and the IHMC 
CmapTools 6). With such tools, users working with a mouse 
and keyboard can create, position, organize, modify, 
evolve, and store and retrieve the nodes and links 
that comprise concept maps. Concept-mapping soft- 
ware offers the same sorts of benefits that word 
processors provide over composing written works 
on paper. That is, such software facilitates revision 
of existing work, including additions, deletions, modi- 
fications, or reorganizations. In fact, students often 
revisit their existing maps to revise them as their 
knowledge of a subject evolves (Anderson-Inman & 
Zeitz, 1993). 

Computer-based concept mapping tools have 
also been used outside educational settings. In busi- 
ness settings, for example, concept-map tools have 
been used for organizing notes taken during meet- 
ings and lectures, and for the preparation of presen- 
tations and written works. There have also been 
efforts to use concept maps as organizing vehicles 
for both designers and end users of Web sites and 
other hypermedia environments (e.g., Gaines & 
Shaw, 1995a; Zeiliger, Reggers, & Peeters, 1996). 
In this context, concept maps have provided visual- 
izations of the structure of the pages, documents, or 
resources of a site and the hyperlink relationships 
among them, as well as a mechanism for directly 
navigating to specific pages. 






FUTURE TRENDS 

More recently, concept-mapping tools have been 
enhanced to enable the representation of informa- 
tion or knowledge that is neither textual nor propo- 
sition based. In many tools, for example, a node may 
be an image rather than a geometric shape with an 
embedded textual description. In the Inspiration 
product, nodes in a concept map may also reference 
media files, such as video and audio, and application- 
specific files, such as spreadsheet or presentation 
documents. The Webster knowledge mapping tool 
(Alpert & Grueneberg, 2000) offers a Web-enabled 
version of these facilities, in which nodes in a 
concept map may reference any media that can be 







Computer-Based Concept Mapping 



presented or played in a Web browser (including 
Web-based tools and applications, such as 
Flash™ interactive programs). As new sense-based 
resources, such as tactile, haptic, and aroma-based 
media, become available on inexpensive computers, 
concept maps should also be capable of incorporating 
nodes representing such information for end-user 
navigation and playback with the goal of having 
concept maps comprehensively represent knowl- 
edge of a domain (Alpert, 2005). 

Concept-map tools have now integrated other 
Web facilities as well, such as nodes that act as 
hyperlinks to Web sites. This innovation allows con- 
cept maps to incorporate the vast amount of knowl- 
edge and information available on the World Wide 
Web. For learning, users need content organized in 
some fashion, focused on a particular topic or domain. 
Rather than relying on a generic search engine, which 
may or may not find relevant content and returns only 
a flat view of information, a concept map provides a 
centralized place to access knowledge and information. Such a 
tool visually organizes relevant content in lucidly 
structured ways while providing semantic links be- 
tween knowledge and information elements. Con- 
cept maps can thereby help students by imposing 
order on the perhaps overwhelming amounts and 
complexity of information germane to a domain. This 
can be especially useful when that information is 
distributed across numerous locations on the Web. 
Concept maps can thereby serve as personal knowl- 
edge management tools for students and other knowl- 
edge workers. 

Concept maps are also emerging as visualization 
tools for the nascent area of topic maps. Topic maps 
are used to organize, for end-user navigation, re- 
sources available on the Web that are germane to a 
particular domain of interest and/or multiple topically 
related domains. A topic map is defined as a model 
“for representing the structure of information re- 
sources used to define topics, and the associations 
(relationships) between topics” (Pepper & Moore, 
2001). Thus, the conceptual connection to concept 
maps is obvious. Topic maps are consistent with the 
ideas expressed above regarding knowledge man- 
agement: The value of a topic map is that it organizes 
for the user, in a single location, resources that reside 
in multiple disparate locations on the Web. However, 
topic maps are defined using a textual language and 
typically presented to end users textually as well. 
The XML (extensible markup language) Topic Maps 
(XTM) specification is an ISO/IEC standard that 
defines an XML-based language for defining and 
sharing Web-based topic maps (International Orga- 
nization for Standardization & International 
Electrotechnical Commission, 2002). But the speci- 
fication does not specify or suggest tools for the 
visualization of a map’s topics, Web-based re- 
sources, or their interrelationships. In practice to 
date, topic maps are often displayed for users using 
a textual (plus hyperlink) format. However, several 
developers are beginning to apply the concept map 
visualization model to provide users with a graphical 
mechanism for both understanding topic maps and 
navigating to specific pages contained therein (e.g., 
HyperGraph 7 and Mondeca 8). 
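
To make the connection between topic maps and concept maps concrete, the following Python sketch models topics, their Web resources (occurrences), and associations, and then renders the associations as labeled arrows in the style of a concept map. It is a simplified illustration and does not implement the XTM syntax; the class names, the arrow notation, and the example.org URLs are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)  # URLs of related resources


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    associations: list = field(default_factory=list)  # (topic, label, topic)

    def add_topic(self, name, *urls):
        self.topics[name] = Topic(name, list(urls))

    def associate(self, a, label, b):
        # Only allow associations between topics that exist in the map.
        if a not in self.topics or b not in self.topics:
            raise KeyError("both topics must be defined first")
        self.associations.append((a, label, b))

    def as_concept_map_edges(self):
        # Graphical view: topics become nodes, associations labeled arrows.
        return [f"{a} --{label}--> {b}" for a, label, b in self.associations]


tm = TopicMap()
tm.add_topic("bird", "http://example.org/birds")    # hypothetical URL
tm.add_topic("wings", "http://example.org/wings")   # hypothetical URL
tm.associate("bird", "has", "wings")
print(tm.as_concept_map_edges())  # ['bird --has--> wings']
```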

CONCLUSION 

Concept maps are a form of knowledge represen- 
tation and knowledge visualization portraying knowl- 
edge and information about a domain of interest in 
a visual and organized fashion. Concept maps have 
evolved from paper-and-pencil tools, to computer- 
based text-only applications, to computer-based 
tools that permit the incorporation of any sort of 
knowledge or information source available in any 
computational form. As new forms of sensory 
output become available in digital form, concept 
maps should provide facilities for the inclusion of 
such media in order to fully represent and share 
knowledge of any domain. Concept maps have 
been used as cognitive tools, especially in educa- 
tional settings, for organizing thoughts and ideas, for 
knowledge elicitation, and for learning. 

KEY TERMS 

Concept Map: A visual representation of knowl- 
edge of a domain consisting of nodes representing 
concepts, objects, events, or actions interconnected 
by directional links that define the semantic relation- 
ships between and among nodes. 

Knowledge Management: Organizing knowl- 
edge, information, and information resources and 
providing access to such knowledge and information 
in such a manner as to facilitate the sharing, use, 
learnability, and application of such knowledge and 
resources. 
Knowledge Map: See Concept Map. 

Knowledge Representation: According to 
Brachman and Levesque (1985, p. xiii), it is “writing 
down, in some language or communicative medium, 
descriptions or pictures that correspond ... to the 
world or a state of the world.” 

Knowledge Visualization: A visual (or other 
sense-based) representation of knowledge; a por- 
trayal via graphical or other sensory means of 
knowledge, say, of a particular domain, making that 
knowledge explicit, accessible, viewable, scrutable, 
and shareable. 

Proposition: A statement, assertion, or declara- 
tion formulated in such a manner that it can be judged 
to be true or false. 

Semantic Network: A knowledge-representa- 
tion formalism from the cognitive-science commu- 
nity (understood by cognitive psychologists to repre- 
sent actual cognitive structures and mechanisms, 
and used in artificial-intelligence applications) con- 
sisting primarily of textually labeled nodes repre- 
senting objects, concepts, events, actions, and so 
forth, and textually labeled links between nodes 
representing the semantic relationships between 
those nodes. 

INTRODUCTION 

In computer-supported collaborative learning 
(CSCL), information and communication technolo- 
gies are used to promote connections between one 
learner and other learners, between learners and 
tutors, and between a learning community and its 
learning resources. CSCL is a coordinated, synchro- 
nous activity of a group of learners resulting from 
their continued attempt to construct and maintain a 
shared conception of a problem (Roschelle & 
Teasley, 1995). 

CSCL systems offer software replicas of many 
of the classic classroom resources and activities 
(Soller, 2001). For example, such systems may 
provide electronic shared workspaces, on-line pre- 
sentations, lecture notes, reference material, quiz- 
zes, student evaluation scores, and facilities for chat 
or online discussions. This closely reflects a typical 
collaborative learning situation in the classroom, 
where the learners participating in learning groups 
encourage each other to ask questions, explain and 
justify their opinions, articulate their reasoning, and 
elaborate and reflect upon their knowledge, thereby 
motivating and improving learning. 

These observations establish both the social context 
and the social processes as integral parts of 
collaborative learning activities. In other words, 
CSCL is a natural process of social interaction and 
communication among the learners in a group while 
they are learning by solving common problems. 

BACKGROUND 

Theory 

Collaborative learning is studied in many learning 
theories, such as Vygotsky’s socio-cultural theory — 
zone of proximal development (Vygotsky, 1978), in 
constructivism, self-regulated learning, situated 
cognition, cognitive apprenticeship, cognitive flexibility 
theory, observational learning, distributed cognition, 
and many more (see Andriessen, Baker, & Suthers, 
2003; Dillenbourg, Baker, Blaye, & O’Malley, 1996; 
Roschelle & Teasley, 1995; TIP, 2004, for a more 
comprehensive insight). A number of researchers 
have shown that effective collaboration with peers is 
a successful and powerful learning method — see, 
for example Brown and Palincsar (1989), Doise, 
Mugny, and Perret-Clermont (1975), Dillenbourg et 
al. (1996), and Soller (2001). However, there is an 
important prerequisite for collaborative learning to 
result in improved learning efficiency and bring other 
learning benefits — the group of learners must be 
active and well-functioning. Just forming a group 
and placing the students in it does not guarantee 
success. The individual learners’ behaviour and 
active participation is important, and so are their 
roles in the group, their motivation, their interaction, 
and coordination. Soller (2001) makes an important 
observation that “while some peer groups seem to 
interact naturally, others struggle to maintain a bal- 
ance of participation, leadership, understanding, and 
encouragement.” 

One should differentiate between cooperative 
and collaborative learning. In cooperative learning, 
the learning task is split in advance into sub-tasks 
that the partners solve independently. The learning is 
more directive and closely controlled by the teacher. 
On the other hand, collaborative learning is based on 
the idea of building a consensus through cooperation 
among the group members; it is more student- 
centered than cooperative learning. 

The Goals of CSCL 

The goals of CSCL are three-fold: 

• Personal: By participating in collaborative 
learning, the learner attains elimination of mis- 
conceptions, more in-depth understanding of 
the learning domain, and development of 
self-regulation skills (i.e., metacognitive skills that 
let the learner observe and diagnose his/her 
self-thinking process and self-ability to regu- 
late or control self-activity); 

• Support Interaction: Maintaining interaction 
with the other learners, in order to attain the 
personal goal associated with the interaction; 
this leads to learning by self-expression (learn- 
ing by expressing self-thinking process, such as 
self-explanation and presentation), and learn- 
ing by participation (learning by participating as 
an apprentice in a group of more advanced 
learners); 

• Social: The goals of the learning group as a 
whole are setting up the situation for peer 
tutoring (the situation to teach each other), as 
well as setting up the situation for sharing 
cognitive or metacognitive functions with other 
learners (enabling the learners to express their 
thinking/cognitive process to other learners, to 
get advice from other learners, discuss the 
problem and the solution with the peers, and the 
like). 

Web-Based Education 

Web-based education has become a very important 
branch of educational technology. For learners, it 
provides access to information and knowledge 
sources that are practically unlimited, enabling a 
number of opportunities for personalized learning, 
tele-learning, distance-learning, and collaboration, 
with clear advantages of classroom independence 
and platform independence. On the other hand, 
teachers and authors of educational material can use 
numerous possibilities for Web-based course offer- 
ing and teleteaching, availability of authoring tools 
for developing Web-based courseware, and cheap 
and efficient storage and distribution of course ma- 
terials, hyperlinks to suggested readings, digital li- 
braries, and other sources of references relevant for 
the course. 

Adaptivity and Intelligence 

Typically, an adaptive educational system on the 
Web collects some data about the learner working 
with the system and creates the learner model 
(Brusilovsky, 1999). Further levels of adaptivity are 
achieved by using the learner model to adapt the 
presentation of the course material, navigation 
through it, its sequencing, and its annotation, to the 
learner. Furthermore, collaborative Web-based edu- 
cational systems use models of different learners to 
form a matching group of learners for different kinds 
of collaboration. This kind of adaptivity is called 
adaptive collaboration support. Alternatively, such 
systems can use intelligent class monitoring to com- 
pare different learner models in order to find signifi- 
cant mismatches, for example, to identify the learn- 
ers who have learning records essentially different 
from those of their peers. These learners need 
special attention from the teacher and from the 
system, because their records may indicate that they 
are progressing too slowly, or too quickly, or have read 
much more or much less material than the others, or 
need additional explanations. 
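
As a rough illustration of intelligent class monitoring, the sketch below compares one numeric attribute of each learner model with the group average and flags learners who deviate strongly. The choice of attribute (pages read), the z-score test, and the threshold of 1.5 are assumptions made for this example; real systems may compare much richer learner models.

```python
from statistics import mean, pstdev


def flag_mismatched_learners(pages_read, threshold=1.5):
    """Return learners whose record deviates strongly from the group mean."""
    values = list(pages_read.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # everyone has the same record; nothing to flag
    return sorted(
        name for name, v in pages_read.items()
        if abs(v - mu) / sigma > threshold
    )


# Hypothetical learner records: number of course pages read so far.
records = {"ana": 40, "ben": 42, "cy": 38, "dee": 41, "eli": 5}
print(flag_mismatched_learners(records))  # ['eli']
```

The same comparison catches learners who are far ahead of the group as well as those who are far behind, since only the size of the deviation is tested.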

Intelligence in a Web-based educational system 
nowadays usually means that the system is capable 
of demonstrating some form of knowledge-based 
reasoning in curriculum sequencing, in analysis of 
the learner’s solutions, and in providing interactive 
problem-solving support (possibly example-based) 
to the learner. Most of these intelligent capabilities 
exist in traditional intelligent tutoring systems as 
well, and were simply adopted in intelligent Web- 
based educational applications and adapted to the 
Web technology. 

CSCL Model 

CSCL technology is not a panacea. Learners who 
use it need guidance and support online, just as 
students learning in the classroom need support from 
their instructor. Hence, developers of CSCL tools 
must ensure that collaborative learning environ- 
ments support active online participation by remote 
teachers, as well as a variety of means for the 
learners to deploy their social interaction skills to 
collaborate effectively. 

In order for each CSCL system to be effective, 
it must be based on a certain model, such as the one 
suggested by Soller (2001) that integrates the fol- 
lowing four important issues: 
• indicators of effective collaborative learning; 

• strategies for promoting effective peer interaction; 

• technology (tools) to support the strategies; and 

• a set of criteria for evaluating the system. 

A CSCL system should recognize and target group 
interaction problem areas. It should take actions to 
help the learners collaborate more effectively with 
their peers, improving individual and group learning. 

Figure 1 shows the typical context of CSCL on the 
Web. A group of learners typically uses a CSCL 
system simultaneously. The system runs on one or 
more educational servers. The learners’ activities 
are focused on solving a problem in the CSCL system 
domain collaboratively. A human teacher can participate 
in the session too, either by merely monitoring 
the learners’ interactions and progress in solving 
problems, or by taking a more active role (e.g., 
providing hints to the learners, suggesting modes of 
collaboration, discussing the evolving solution, and so 
on). Intelligent pedagogical agents provide the nec- 
essary infrastructure for knowledge and information 
flow between the clients and the servers. They 
interact with the learners and the teachers and col- 
laborate with other similar agents in the context of 
interactive learning environments (Johnson, Rickel, 
& Lester, 2000). Pedagogical agents are especially helpful 
in locating, browsing, selecting, arranging, integrat- 
ing, and otherwise using educational material from 
different educational servers. 



MAIN ISSUES IN CSCL 

The Types of Interaction 

Since the issue of interaction is central to CSCL, it 
is useful to introduce the types of interaction the 
learner typically meets when using such systems 
(Curtis & Lawson, 2001): 

• interaction with resources (such as related 
presentations and digital libraries); 

• interaction with teachers (teachers can par- 
ticipate in CSCL sessions); 

• interaction with peers (see the above description 
of the goals of CSCL); and 

• interaction with interface (this is the most 
diverse type of interaction, ranging from lim- 
ited text-only interactions, to the use of spe- 
cific software tools for dialogue support, based 
on dialogue interaction models, to interaction 
with pedagogical agents [see Figure 1]). 

The Kinds of CSCL 

Starting from the theory of collaborative learning 
and applying it along with AI techniques to CSCL 
systems on the Web, the research community has 
made advances in several directions related to 
collaboration in learning supported by Web tech- 
nologies: 



Figure 1. The context of Web-based CSCL 

• Classical CSCL: This comprises setting up 
CSCL in Web classrooms, as well as infra- 
structure for CSCL in distance learning; 

• Learning Companions: These are artificial 
learners, for example, programs that help human 
learners learn collaboratively if they so wish, 
even when no other peer learners are around; 

• Learning Communities: Remote learners 
can communicate intensively not only by solv- 
ing a problem in a group, but also by sharing 
common themes, experiences, opinions, and 
knowledge in the long run; 

• Web Services: This general and extremely 
popular recent technology is nowadays used in 
learning situations as well; 

• Hybrid Modes: Some, or even all of the above 
capabilities can be supported (at least to an extent) 
in an intelligent Web-based CSCL system. 

Design Issues 

It is well understood that the learning process is 
more effective if the user interface is designed to be 
intuitive, easy-to-use, and supportive in terms of the 
learners’ cognitive processes. With CSCL systems, 
additional flexibility is required. The learners have to 
work collaboratively in a shared workspace environ- 
ment, but also use private workspaces for their own 
work. Moreover, since work/learning happens in 
small groups, the interface should ideally support the 
group working in one environment, or in synchronous 
shared environments. It also must support sharing of 
results, for example, exchanging settings and data 
between the groups and group members, as well as 
demonstrating the group’s outcomes or conclusions. 
A suitable way to do this is to use a public workspace. 

This division of learning/work into shared and 
private workspaces leads to the idea of workspaces 
that can contain a number of transparent layers 
(Pinkwart, 2003). The layers can have “solid” objects 
(synchronizable visual representations), that 
is, handwriting strokes or images. Also, the layers 
can be private or shared, for example, a private 
handwriting layer used for personal annotations. 
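
The layered-workspace idea can be sketched as follows. This is an assumed, minimal model rather than the design of any actual system: each layer is either shared (no owner) or private to one learner, and a learner's view is composed of all shared layers plus that learner's own private layers. The names `Layer`, `Workspace`, and `visible_objects` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Layer:
    name: str
    owner: Optional[str] = None  # None means the layer is shared by the group
    objects: list = field(default_factory=list)


@dataclass
class Workspace:
    layers: list = field(default_factory=list)

    def visible_objects(self, learner):
        # A learner sees shared layers plus only their own private layers.
        seen = []
        for layer in self.layers:
            if layer.owner is None or layer.owner == learner:
                seen.extend(layer.objects)
        return seen


ws = Workspace([
    Layer("model", None, ["group diagram"]),
    Layer("notes", "ana", ["ana's handwritten annotation"]),
])
print(ws.visible_objects("ana"))  # ['group diagram', "ana's handwritten annotation"]
print(ws.visible_objects("ben"))  # ['group diagram']
```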

Group Formation 

If for any reason a learner wants to participate in 
collaborative learning on the Web, the learning efficiency 
depends on joining an appropriate learning 
group. Hence the question, “How to form a group?” 
is important. 

Opportunistic group formation (OGF) is a frame- 
work that enables pedagogical agents to initiate, 
carry out, and manage the process of creating a 
learning group when necessary and managing the 
learner’s participation in the group. Agents in OGF 
support individual learning, propose shifting to col- 
laborative learning, and negotiate to form a group of 
learners with appropriate role assignment, based on 
the learners’ information from individual learning. 

In OGF, a collaborative learning group is formed 
dynamically. A learner is assumed to use an intelligent, 
agent-enabled Web-based learning environment. 
When an agent detects a situation for the 
learner to shift from individual to collaborative learn- 
ing mode (a “trigger,” such as an impasse or a need 
for review), it negotiates with other agents to form 
a group. Each group member is assigned a reason- 
able learning goal and a social role. These are 
consistent with the goal for the whole group. 
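
The trigger-then-form-a-group flow can be illustrated with a heavily simplified Python sketch. In OGF the group emerges from negotiation among agents; here that negotiation is collapsed into a single function for brevity, and the trigger fields, the role names ("learner", "peer tutor"), and the selection rule are all assumptions made for the example.

```python
def form_group(requester, trigger, candidates, size=3):
    """Pick peers for a learner whose agent detected a trigger.

    candidates maps learner name -> set of topics that learner has mastered.
    Returns None when no suitable peer is found.
    """
    topic = trigger["topic"]
    # Peers who have mastered the topic can act as helpers.
    helpers = sorted(
        name for name, known in candidates.items()
        if topic in known and name != requester
    )
    if not helpers:
        return None  # no group formed; stay in individual learning
    members = helpers[: size - 1]
    # Each member gets a social role consistent with the group goal.
    roles = {requester: "learner"}
    roles.update({name: "peer tutor" for name in members})
    return {"goal": f"resolve {trigger['kind']} on {topic}", "roles": roles}


group = form_group(
    "ana",
    {"kind": "impasse", "topic": "ER keys"},
    {"ben": {"ER keys"}, "cy": set(), "dee": {"ER keys", "SQL"}},
)
print(group)
```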

APPLICATIONS 

Two practical examples of CSCL systems described 
in this section illustrate the issues discussed earlier. 



COLER 

COLER is an intelligent CSCL system for learning 
the principles of entity-relationship (ER) modelling 
in the domain of databases (Constantino-Gonzalez, 
Suthers, & Escamilla de los Santos, 2003). The 
learners using the system through the Web solve a 
specific ER modelling problem collaboratively. They 
see the problem’s description in one window and 
build the solution in another one, which represents a 
shared workspace. Each learner also has his/her 
own private workspace in which he/she builds his/ 
her own solution and can compare it to the evolving 
group solution in the shared workspace. A learner 
can invoke a personal coach (an intelligent peda- 
gogical agent) to help him/her solve the problem and 
contribute to the group solution. In addition to such 
guidance, there is also a dedicated HELP button to 
retrieve the basic principles of ER modelling if 
necessary. At any time during the problem solving, 
a learner can see in a specially-designated panel on 
the screen which team-mates are already connected 
and can ask for floor control. When granted control, 
the learner contributes to the group solution in the 
shared workspace by, for example, inserting a new 
modelling element. He/she can also express feelings 
about the other team-mates’ contributions through a 
designated opinion panel, and can also engage in 
discussion with them and with the coach through a 
chat communication window. 
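
Floor control of the kind described above can be sketched as a small Python class. The source does not say how COLER decides who receives control next, so the first-come, first-served queue used here is an assumption, as are the class and method names.

```python
from collections import deque


class FloorControl:
    """Only the current floor holder may edit the shared workspace."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, learner):
        # Grant the floor if it is free; otherwise queue the request.
        if self.holder is None:
            self.holder = learner
        elif learner != self.holder and learner not in self.waiting:
            self.waiting.append(learner)
        return self.holder == learner

    def release(self, learner):
        if learner != self.holder:
            raise PermissionError(f"{learner} does not hold the floor")
        # Pass control to the next waiting learner, if any.
        self.holder = self.waiting.popleft() if self.waiting else None


floor = FloorControl()
print(floor.request("ana"))  # True: the floor was free
print(floor.request("ben"))  # False: ben waits for his turn
floor.release("ana")
print(floor.holder)          # ben
```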

Cool Modes 

The idea of using transparent layers in the design of 
the user interface is best exemplified in the intelligent 
Web-based CSCL system called Cool Modes (Collaborative 
Open Learning, Modelling and DEsigning System) 
(Pinkwart, 2003). The system supports the Model 
Facilitated Learning (MFL) paradigm in different 
engineering domains, using modelling tools, 
construction kits, and system dynamics simulations. The 
focus of the learning process is on the transformation 
of a concrete problem into an adequate model. 
The shared workspace (Figure 2) is public and looks 
the same to all the learners in the group. However, 
the handwritten annotations are placed in private 
layers and can be seen only by individual learners. 
Cool Modes also provides “computational objects to 
think with” in a collaborative, distributed framework. 
The objects have a specified domain-related 
functionality and semantics, enriched with rules and 
interpretation patterns. Technically, Cool Modes is 
integrated with visual modelling languages and has a 
set of domain-specific palettes of such objects (see 
the palette on the right-hand side of Figure 2). The 
palettes are defined externally to encapsulate 
domain-dependent semantics and are simply plugged 
into the system when needed. Currently, the system 
has palette support for modelling stochastic 
processes, system dynamics, Petri nets, and other 
engineering tools, as well as for learning Java. 



FUTURE TRENDS 

Open Distributed 
Learning Environments 

There is an important trend in software architectures 
for CSCL: open distributed learning environments 
(Muehlenbrock & Hoppe, 2001). The idea 
here is that learning environments and support systems 
are not conceived as self-contained, but as 
embedded in realistic social and organizational envi- 
ronments suitable for group learning, such as Web 
classrooms. Different Web classrooms can be inter- 
connected among themselves letting the learners 
communicate with the peers and teachers not physi- 
cally present in the same classroom, but logged onto 



Figure 2. A screenshot from Cool Modes (after Pinkwart, 2003) 
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the same network/application. Moreover, the CSCL 
application can link the learners with other relevant 
learning resources or remote peers and tutors through 
the Internet. 

CSCL and the Semantic Web 

The Semantic Web (SemanticWeb.org, 2004) is the 
new-generation Web that makes it possible to express 
information in a precise, machine-interpretable form, 
ready for software agents to process, share, and 
reuse it, as well as to understand what the terms 
describing the data mean. It enables Web-based 
applications to interoperate on both the syntactic and 
semantic levels. The key components of Semantic 
Web technology are ontologies of standardized 
terminology that represent domain theories; each 
ontology is a set of knowledge terms, including the 
vocabulary, the semantic interconnections, and some 
simple rules of inference and logic for some 
particular topic. 
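
As a rough illustration of the three ingredients just listed (vocabulary, semantic interconnections, and simple inference rules), the sketch below stores a tiny ontology as subject-relation-object triples and applies one rule, the transitivity of "is_a". The term names and relation names are hypothetical and are not taken from any published ontology; real Semantic Web ontologies are normally written in dedicated languages rather than in ad hoc Python.

```python
from collections import defaultdict

# Hypothetical mini-ontology: (subject, relation, object) triples.
# The vocabulary is the set of terms; the triples are the interconnections.
TRIPLES = [
    ("PeerTutoringGroup", "is_a", "LearningGroup"),
    ("LearningGroup", "is_a", "Group"),
    ("LearningGroup", "has_member", "Learner"),
    ("Tutor", "is_a", "Learner"),
]


def superclasses(term, triples):
    """Return every term reachable from `term` by following 'is_a' links."""
    parents = defaultdict(set)
    for subject, relation, obj in triples:
        if relation == "is_a":
            parents[subject].add(obj)
    found, stack = set(), [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_a(term, cls, triples):
    """Inference rule: 'is_a' is reflexive and transitive."""
    return term == cls or cls in superclasses(term, triples)
```

A software agent that only knows the general term "Group" could use such a rule to recognize that a "PeerTutoringGroup" also qualifies, which is the kind of semantic-level interoperation described above.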

At the moment, educational ontologies are still 
scarce: developing ontologies of high usability is 
anything but easy, and the Semantic Web has been around 
for just a couple of years. Still, the CSCL community has 
already ventured into developing CSCL ontologies. In 
their pioneering and important work, 
Supnithi, Inaba, Ikeda, Toyoda, and Mizoguchi (1999) 
made a considerable effort towards developing 
the collaborative learning ontology (CLO). Although 
still not widely used, CLO clarifies the concepts 
of a collaborative learning group and the 
relations among the concepts. It answers general 
questions like: 

• What kinds of groups exist in collaborative 
learning? 

• Who is suitable for joining the group? 

• What roles should be assigned to the members? 

• What is the learning goal of the whole group? 

CONCLUSION 

This article surveyed important issues in CSCL in 
the context of intelligent Web-based learning envi- 
ronments. Current intelligent Web-based CSCL sys- 
tems integrate a number of Internet and artificial 
intelligence technologies. This is not to say that 



learning theories and instructional design issues 
should be given lower priority than technological 
support. On the contrary, new technology offers 
more suitable ways for implementing and evaluating 
instructional expertise in CSCL systems. 
KEY TERMS 

Adaptive Collaboration Support in CSCL: 

Using models of different learners to form a match- 
ing group of learners for different kinds of collabo- 
ration. 

Computer-Supported Collaborative Learn- 
ing (CSCL): The process related to situations in 
which two or more subjects build synchronously and 
interactively a joint solution to some problem 
(Dillenbourg, 1999; Dillenbourg et al., 1996; 
Dillenbourg & Schneider, 1995). 

Group Formation: The process of creating a 
suitable group of learners to increase the learning 
efficiency for both the individual peers and the group 
as a whole. 

Pedagogical Agents: Autonomous software 
entities that support human learning by interacting 
with learners and teachers and by collaborating with 
other similar agents, in the context of interactive 
learning environments (Johnson et al., 2000). 

Private Workspace: Part of the CSCL system, 
usually represented as a designated window in the 
system’s GUI, where a member of the learning 
group builds his/her own solution of the problem the 
group solves collaboratively, and where he/she can 
also take notes, consider alternative solutions, and 
prepare contributions to the group solution. 

Shared Workspace: Part of the CSCL system, 
usually represented as a designated window in the 
system’s GUI, where the members of the learning 
group build the joint solution to the problem they 
solve collaboratively. 
INTRODUCTION 

The study of conceptual models is both a complex 
and an important field within the HCI domain. Many 
of its key principles resulted from research and 
thinking carried out in the 1980s, arguably in the 
wake of Norman (1983). Since then, the importance 
of conceptual models in affecting the usability of an 
Information and Communication Technology (ICT) 
system has become well-established (e.g., they fea- 
ture prominently in the widely cited design guidelines 
for interfaces defined by Norman [1988], which are 
summarized in Figure 1 by Lienard [2000]). 

Today, most HCI professionals are able to at- 
tribute significant meaning to the term conceptual 
model and to recognize its importance in aiding 
usability. However, two problems seem to prevail. 
First, some HCI researchers and practitioners lack 
a precise understanding of conceptual models (and 
related ideas), and how they affect usability. Sec- 
ond, much of the research in this field is (necessar- 
ily) abstract in nature. In other words, the study of 
conceptual models is itself highly conceptual, with 
the result that practitioners may find some of the 
theory difficult to apply. 



This article is designed to help both researchers 
and practitioners to better understand the nature of 
conceptual models and their role in affecting usabil- 
ity. This includes explaining and critiquing both 
contemporary and (possible) future approaches to 
leveraging conceptual models in the pursuit of im- 
proved usability. 

BACKGROUND 

Key to understanding the role of conceptual models 
in promoting usability are clear definitions of these 
terms, related ideas, and their appropriate 
contextualization within the HCI domain. 

Definitions of Usability 

Probably the first widely cited definition of usabil- 
ity, as it applies to ICT systems, was established by 
Shackel (1991) and is shown in Figure 2. 

The definition provided by Shackel (1991) is 
reasonably comprehensive, which is one reason it 
remains useful today. However, a more concise 



Figure 1. Design guidelines for interfaces defined by Norman (1988), summarized by Lienard (2000) 



A. Good visibility means you can: 

• tell the state of the system by looking at it 

• tell what the alternatives for actions are 

• identify controls to make the system perform the available actions 

B. Good conceptual models provide: 

• consistent presentation of the system’s state 

• consistent controls, possible actions, and results 

C. Good mappings mean you can determine the: 

• relationship between actions and results 

• relationship between controls and actions 

• system state from what is visible 

D. Good feedback involves: 

• full and continuous presentation of the results of actions 

• timely (i.e., rapid) response times 

System image = User’s model of system 

A “good” user model makes the user feel: 

• in control of the system 

• confident of getting the required result(s) 
Figure 2. Definition of usability by Shackel (1991) 

Effectiveness: Improvement in task performance in terms of speed and/or error rate by a given 
percentage of the population within a given range of the user’s tasks (related to 
the user’s environment) 

Learnability: Within some specified time from commissioning and start of user training based 
upon some specified amount of user support and within some specified relearning 
time each time for intermittent users 

Flexibility: With flexibility allowing adaptation to some specified percentage variation in task 
and/or environments beyond those first specified 

Attitude: Within acceptable levels of human cost in terms of tiredness, discomfort, 
frustration and personal effort, so that satisfaction causes continued and enhanced 
usage of the system 



definition was established in ISO 9241-11:1998 and 
is summarized by Maguire (1998): 

• Effectiveness: How well the user achieves 
the goals he or she sets out to achieve using the 
system. 

• Efficiency: The resources consumed in order 
to achieve his or her goals. 

• Satisfaction: How the user feels about his or her use 
of the system. 
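
To make the three elements concrete, the following sketch computes one possible measure for each from a handful of test sessions. The operationalizations are assumptions chosen for illustration (completion rate for effectiveness, mean time on completed tasks for efficiency, mean post-task rating for satisfaction); the standard itself does not prescribe these particular measures, and the `Session` record is hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    completed: bool   # did the user achieve the goal?
    seconds: float    # time spent on the task (a resource consumed)
    rating: int       # post-task rating, 1 (low) to 5 (high)


def summarize(sessions):
    """Return one illustrative measure per usability element."""
    if not sessions:
        raise ValueError("at least one session is required")
    done = [s for s in sessions if s.completed]
    return {
        # Effectiveness: share of sessions in which the goal was achieved.
        "effectiveness": len(done) / len(sessions),
        # Efficiency: mean time on successful tasks (None if none succeeded).
        "efficiency": mean(s.seconds for s in done) if done else None,
        # Satisfaction: mean rating over all sessions.
        "satisfaction": mean(s.rating for s in sessions),
    }
```

Calling `summarize` on four sessions, three of them successful, would report an effectiveness of 0.75. Lower efficiency values are better here because the measure is time consumed.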

These definitions are widely cited. However, the 
ISO 9241-11:1998 definition arguably has superseded that of 
Shackel (1991) and is, therefore, used throughout 
the remainder of this article. 

Definitions of a Conceptual Model 

The word model implies an abstraction of the sub- 
ject matter, or artefact, being modeled. This is true 
whether that artefact is an ICT system, a motor- 
cycle, or a house. Inevitably, a model lacks the full 
detail present within an actual artefact, so, in pro- 
ducing the model, some properties of the artefact 
will be ignored or simplified. The particular abstrac- 
tion will depend on the (intended) use of the model 
(e.g., a technical drawing of a motorcycle used in its 
manufacture abstracts different properties from that 
of an artist’s sketch used in a sales brochure). 
Similarly, a usability engineer may model only those 
properties of an ICT system concerned with its 
interface, while a technical architect might model in 
terms useful to coding the system. In both cases, the 
subject matter is common, yet the abstractions and 
resulting models are very different. 



The word conceptual stems from the word 
concept. This also implies some form of abstraction 
and, hence, a model. In psychology-oriented fields, 
this term may be used synonymously with the word 
idea and, therefore, has connotations relating to 
cognition, perception, innovation, and, most impor- 
tantly, models stored in the mind. Alternatively, in 
(product) design-oriented fields, a conceptual model 
is more likely to be interpreted as an abstraction 
concerned only with the key or fundamental proper- 
ties of an artefact (i.e., a model considerably lacking 
detail). Further, such models typically are expressed 
in concrete terms (e.g., a designer’s sketch, clay 
model, or engineer’s prototype). 

The HCI domain incorporates principles related 
to both psychology and (product) design (an ICT 
system is a product). Similarly, both of the two 
(overlapping) definitions of a conceptual model pre- 
sented have relevance here. 

Mental Models 

More in keeping with a (product) design-oriented 
view of conceptual models, we might define and 
express the conceptual model of an ICT system in 
concrete terms, using devices such as storyboards 
and Entity Relationship Diagrams (ERDs). How- 
ever, these conceptual models only can be utilized 
once inside our minds (i.e., once converted into a 
mental model). Indeed, most cognitive scientists 
agree that our entire perception of the world, includ- 
ing ourselves, is constructed from models within our 
minds. Further, we only can interact with the world 
through these mental models. This is an insight 
generally credited to Craik (1943), although its origins 
can be traced back to Plato. 

Mental models inevitably are incomplete, con- 
stantly evolving, and contain errors (Khella, 2002). 
They usefully can be considered as an ecosystem, a 
term used by Ratey (2002) to describe the brain, 
which, of course, stores and processes our mental 
models. This means that particular models can come 
and go, increase or decrease in accuracy, and con- 
stantly mutate and adapt, both as a result of internal 
processing and in response to external stimuli. 

A person may maintain simultaneously two or 
more compatible mental models of the same subject 
matter, as with the example of the technical drawing 
and the artist’s sketch used for a motorcycle (Khella, 
2002). Similarly, it is possible for a person to maintain 
simultaneously two or more contradictory mental 
models, a condition known as cognitive dissonance 
(Atherton, 2002). 

In the early 1980s, the idea was established that a 
person may acquire and maintain two basic types of 
mental models for an ICT system: a functional model 
and a structural model. Functional models, also 
referred to as task-action mapping models, are con- 
cerned with how users should interact with the 
system in order to perform the desired tasks and 
achieve their goals. ICT professionals such as usabil- 
ity engineers typically are concerned with this type of 
model. Structural models are concerned more with 
the internal workings and architecture of the system 
and on what principles it operates (i.e., how a system 
achieves its functionality). This type of model is 
generally the concern of ICT professionals such as 
systems architects and coders. Of course, this fits 
well with the idea that an individual may maintain 
simultaneously two or more compatible mental mod- 
els — an informed user of an ICT system may have 
both a good functional and structural mental model of 
the system. 

MENTAL MODELS AND USABILITY 

The arguments for a user possessing a good mental 
model of the ICT system they are using can be 
expressed using the elements of usability defined in 
ISO 9241-11:1998: 



• Efficiency: Users with a good mental model 
will be more efficient in their use of an ICT 
system, because they already will understand 
the (optimal) way of achieving tasks; they will 
not have to spend time learning these mecha- 
nisms. 

• Effectiveness: Users will be more effective 
due to their understanding of the system’s 
capabilities and the principles by which these 
capabilities may be accessed. 

• Satisfaction: As a result of increased effi- 
ciency and effectiveness and because users 
can predict more successfully the behavior of 
the system, they also are likely to be more 
satisfied when using the system. 

Conversely, users with a largely incomplete, 
distorted, or inaccurate mental model may experi- 
ence one or more of the following usability prob- 
lems, which, again, are categorized using the elements 
of usability defined in ISO 9241-11:1998: 

• Efficiency: The user may execute functions 
in a (highly) suboptimal way (e.g., not utilizing 
available shortcuts). 

• Effectiveness: The user’s understanding of 
the system will be limited detrimentally in 
scope, so (potentially) useful functionality might 
be hidden from him or her. 

• Satisfaction: The user’s mental model may 
work (to a certain degree) until task complex- 
ity increases or until completely new tasks are 
required. At this stage, catastrophic model 
failure may occur (e.g., a user fails to predict 
the consequences of a particular input to the 
system and gains an outcome very different 
from that expected). Such failure can be quite 
disastrous, leaving users confused and 
demotivated. Indeed, this is one pathology of 
so-called computer rage, a term popular at 
the time of writing. 

These factors explain why the highest level of 
usability occurs when an individual operates a sys- 
tem that they have designed themselves. Here, the 
designer and user’s model of the system can be 
identical (since they belong to the same person). 
This also explains why these user-designed systems 
very often fail when rolled out to other users 
(Eason, 1988). These systems often are successful 
in the designer’s hands, but this success is attributed 
to the design of the system itself rather than simply 
being a consequence of the fact that the designer 
inevitably has an excellent mental model of (how to 
use) the system. 

Given this, it seems obvious that a user’s mental 
model of an ICT system benefits from being as 
comprehensive and accurate as possible. This is 
why Norman (1988) included this element in his 
design guidelines for interfaces (summarized in Fig- 
ure 1 by Lienard [2000]). The question then is how 
do we develop and exploit these models to promote 
usability? 

The Intuitive Interface Approach to 
Developing Users’ Mental Models 

Today, many ICT systems are designed with the 
anticipation (or hope) that users will be able to (learn 
how to) operate them within a matter of minutes or 
even seconds. Further, these users often are anony- 
mous to the system designers. This is particularly 
true in the case of pervasive ICT systems such as 
those found on the World Wide Web (WWW). 

Many contemporary practitioners propose that 
quick and easy access to pervasive ICT systems can 
be achieved by designing an intuitive interface. 
This term is widely interpreted to mean that a user 
can operate a system by some innate or even nearly 
supernatural ability (Raskin, 1994). However, this 
notion is ill founded in the HCI domain (Norman, 
1999; Raskin, 1994) and lacks supporting empirical 
evidence (Raskin, 1994). Rather, so-called intuitive 
interfaces simply rely on the fact that users already 
possess mental models that are sufficiently relevant 
to a system such that they can (at least) begin to use 
the system. In other words, the term intuitive simply 
implies familiar (Raskin, 1994). This familiarity is 
often exploited through the use of metaphors, 
whereby a mental model that was developed for use 
in the physical world (e.g., Windows) is leveraged by 
the system designers to aid its use. Important in this 
approach is the idea (or hope) that, from a position 
of (some) familiarity, users then are able to develop 
their understanding of the system by self-exploration 
and self-learning, or self-modeling. 



Norman (1981) hypothesized that if users are left 
to self-model in this way, they always will develop a 
mental model that explains (their perception of) the 
ICT system, and research carried out by Bayman 
and Mayer (1983) supports this hypothesis. Norman 
(1981) also argued that, in these situations, the 
mental models developed by the users are likely to be 
incorrect (e.g., a user simply may miss the fact that 
a metaphor is being used or how the metaphor 
translates into the ICT system), a scenario that is 
likely to result in the usability problems cited earlier. 
Another problem with this approach is that with 
pervasive ICT systems where users are often anony- 
mous, it can be extremely difficult to predict accu- 
rately what mental models already are possessed by 
the target user group. 






The Conceptual Model Approach to 
Developing Users’ Mental Models 



An alternative approach to exploiting mental models 
is to explicitly provide users with a conceptual model 
of the ICT system that accurately reflects its true 
properties. This conceptual model approach was 
advocated in hypothetical terms by Norman (1981), 
Carroll and Olson (1988), and Preece (1994). In 
practice, this approach generally is realized by pre- 
senting the users with suitable schemas or meta- 
phors relevant to the system. In relation to this, 
Norman (1983) offered some revised definitions of 
modeling terminology — he distinguished between 
the tangible conceptual model that is provided to the 
user as an explanation of the system (which might 
use, for example, a storyboard or ERD) and the 
resulting mental model that is formed in the user’s 
mind. This distinction is useful and will be used 
throughout the remainder of this article. 

With the conceptual model approach, it seems 
unlikely that a user’s mental model will overlap 
completely with the conceptual model presented. 
This is because the formation of mental models is a 
highly complex process involving human beings with 
all of their individual abilities, preferences, and idio- 
syncrasies. Further, determining the degree of over- 
lap is somewhat problematic, since it is notoriously 
difficult to elicit and understand the mental model a 
user has of an ICT system and how it is being 
exploited during interaction. Indeed, many studies 
have attempted to do this and have failed (Preece, 
1994; Sasse, 1992). Given this, it is difficult to prove 
a causal link between the conceptual model approach 
and increased usability. However, a large 
number of studies have demonstrated that, when 
users are explicitly presented with accurate conceptual 
models, usability can be improved significantly. 
These studies include Foss et al. (1982), Bayman 
and Mayer (1983), Halasz and Moran (1983), Kieras 
and Bovair (1984), Borgman (1986), and Frese and 
Albrecht (1988). As an example, Borgman (1986) 
showed how users could better operate a library 
database system after being provided with a concep- 
tual model of the system that utilized a card index 
metaphor, as compared with users in a control group 
who were taught in terms of the operational proce- 
dures required to achieve specific goals and tasks. 
Further, research from Halasz (1984) and Borgman 
(1986) demonstrated that the usability benefits of the 
conceptual model approach increase with task com- 
plexity. 

While these studies demonstrated well the us- 
ability benefits of the conceptual model approach, 
they were limited in that the conceptual models were 
explained through some form of face-to-face teach- 
ing. This presents two interrelated problems within 
the context of modern-day pervasive ICT systems. 
First, this type of education is expensive. Second, the 
user population may be diverse and largely unknown 
to the system vendors. In summary, this sort of face- 
to-face approach is often unviable within this con- 
text. 



FUTURE TRENDS 

Progression in this field might be sought in two 
important ways. First, to address the limitations of 
the conceptual model approach cited previously, it 
would be useful to establish viable means of present- 
ing users with conceptual models when the ICT 
system is pervasive. Second, the opportunity exists 
to develop better conceptual models with which to 
explain ICT systems. 

Online Conceptual Models 

A means of providing conceptual models where the 
ICT system is pervasive is through the use of online 



digital presentations. These might constitute a type 
of mini-lecture about the conceptual model for the 
system, perhaps utilizing (abstractions of) the very 
schemas used to design the system (e.g., storyboards 
and ERDs or suitable metaphors). This is a similar 
idea to online help, except that the user support is 
much more conceptual and self-explanatory in na- 
ture. 

Many organizations produce (online) digital pre- 
sentations to complement their products and ser- 
vices. However, such presentations typically are 
exploited to sell the system rather than to explain its 
use. There are digital presentations that are educa- 
tionally biased. These include a vast amount of 
computer-based training (CBT) resources available 
both online (WWW) and in compact disk (CD) 
format. However, these tend to focus on how par- 
ticular tasks are performed rather than developing a 
deep understanding of the concepts that underpin 
the system. Similarly, there are some good examples 
of how digital presentations have been used to 
communicate concepts (e.g., EDS). However, these 
presentations generally are not directed at using a 
specific ICT system or for use by the typical user. 

Site maps in WWW-based systems are (argu- 
ably) an example of online devices designed to 
convey conceptual understanding. However, while 
their use is now widespread and these devices are 
sometimes useful, they are limited in the scope of 
what they convey and their ability to explain them- 
selves. This is in contrast to an online presentation 
specifically designed to explain a suitable conceptual 
model of a system. 

Better Conceptual Models 

Works related to both of the approaches discussed 
in the previous section (intuitive interface and con- 
ceptual model) share some similarity in that they 
both focus on the user’s knowledge of the system's 
interface and, therefore, the development of func- 
tional mental models. This might be expected in an 
HCI-related field. However, overreliance on (just) 
functional mental models inevitably limits the user’s 
understanding of the system, and it can be argued 
that this consequentially limits usability. 

Some researchers (e.g., Preece, 1994) have 
hypothesised that users might benefit greatly from 
(also) acquiring structural mental models of the 
systems they use. Such models may better help the 
users to anticipate the behavior of the system, 
particularly in new contexts of use and when the 
system is being used nearly at its performance limits. 
Indeed, the benefit of having a good structural 
mental model helps to explain why the user-designed 
ICT systems discussed earlier are so usable — the 
designer of a system inevitably has a good structural 
mental model. The problem with this approach is that 
simply inferring structural models through self-mod- 
eling is extremely difficult and likely to fail (Miyake, 
1986; Preece, 1994). However, it may be possible to 
develop useful structural mental models in users by 
providing them with appropriate conceptual models. 

Further, if users are provided with both structural 
and functional conceptual models with which to form 
their mental model, triangulation can take place — 
a user is able to think critically about whether the 
functional and structural models are complementary 
or are the source of cognitive dissonance, in which 
case users may seek greater clarity in their under- 
standing. Indeed, such use of triangulation is an 
established principle of understanding any subject 
matter (Weinstein, 1995). 



CONCLUSION 

The merit in users having a good mental model of an 
ICT system would seem to be universally recognized 
and has been inferred by many research studies. 
Some professionals in this field might argue that 
progression has slowed since the 1980s and that the 
arguments and conclusions presented here might 
make a useful agenda for future research and 
consultancy. Specifically, we might proceed by en- 
couraging users to develop more structural mental 
models of ICT systems. Similarly, presenting con- 
ceptual models using online digital presentations 
may be of key importance in improving the usability 
of pervasive ICT systems. 
KEY TERMS 

Cognitive Dissonance: The situation where a 
person simultaneously holds two contradictory mod- 
els of the same subject matter (e.g., an ICT system). 

Conceptual Model: A model concerned with 
key, or fundamental, properties of the system. Typi- 
cally concerned with the rationale and scope of a 
system, what the system is designed to do, the basic 
principles on which the system operates. Also, the 
basic principles utilized in operating the system. 
Alternatively, the model specifically offered to a 
user, which explains these ideas (Norman, 1983). 

Functional Model: A model concerned prima- 
rily with how to interact with a system, how it is 
operated. 

Mental Model: A model stored and processed 
in a person’s mind. 

Model: A simplified abstraction that shows prop- 
erties of some subject matter relevant to a particular 
purpose, context, or perspective. 

Self-Modeling: The process whereby an un- 
aided user develops his or her own mental model of 
a system to explain its behavior, achieved through 
exploration or trial and error learning. 

Structural Model: A model concerned prima- 
rily with the internal workings and architecture of 
the system and on what principles it operates and 
how a system achieves its functionality. 

Task-Action Mapping Model: Synonym for 
functional model. 

Triangulation: When a subject is viewed from 
more than one perspective during the learning or 
perceptual process. 

Usability: Defined in ISO 9241-11:1998 as having 
three elements, as summarized by Maguire (1998): 
effectiveness (how well users achieve the goals 
they set out to achieve using the system); 
efficiency (the resources consumed in order to achieve 
their goals); and satisfaction (how users feel about 
their use of the system). 
INTRODUCTION 

Through pervasive computing, users can access 
information and applications anytime, anywhere, 
using any device. But as mobile devices such as 
Personal Digital Assistants (PDAs), smartphones, and 
consumer appliances continue to flourish, it becomes 
a significant challenge to provide more tailored and 
adaptable services for this diverse group. Many 
hurdles must be crossed to make it easier for people 
to use mobile devices effectively. Among them 
is small display size, which is always a challenge. 

Usually, applications and documents are mainly 
designed with desktop computers in mind. When 
browsing through mobile devices with small display 
areas, users’ experiences will be greatly degraded 
(e.g., users have to continually scroll through a 
document to browse). However, as users acquire or 
gain access to an increasingly diverse range of 
portable devices (Coles, Deliot, & Melamed, 2003), 
the display area need no longer be limited 
to a single device, but can be extended to the 
display areas on all available devices. 

As can be readily seen from practice, the sim- 
plest multi-device scenario is when a user begins an 
interaction on a first access device, then ceases to 
use the first device and completes the interaction 
using another access device. This simple scenario 
illustrates a general concern about a multi-device 
browsing framework: the second device should be 



able to work cooperatively to help users finish 
browsing tasks. 

In this article, we propose a cooperative framework 
to facilitate information browsing among devices 
in a mobile environment. We set out to overcome 
the display constraint of a single device by 
utilizing the cooperation of multiple displays. This 
scheme has two parts: (1) establishing a 
communication mechanism to maintain cooperative 
browsing across devices; and (2) designing a 
distributed user interface across devices to cooperatively 
present information and overcome the limited 
display area of a single device. 
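
The two parts of the scheme can be pictured with a small sketch: a shared session object stands in for the communication mechanism, and a role assignment stands in for the distributed user interface. Everything here is a hypothetical illustration (the class names, the "detail" and "overview" roles, and the rule that the largest display shows the detail view are assumptions), not the framework's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    width: int    # display width in pixels
    height: int   # display height in pixels


@dataclass
class BrowsingSession:
    """Shared state that survives when the user switches devices."""
    document: str
    position: int = 0                       # e.g., index of the current section
    devices: list = field(default_factory=list)

    def join(self, device):
        self.devices.append(device)

    def leave(self, device):
        self.devices.remove(device)

    def assign_roles(self):
        """Assumed policy: largest display shows detail, others an overview."""
        if not self.devices:
            return {}
        ranked = sorted(self.devices,
                        key=lambda d: d.width * d.height, reverse=True)
        roles = {ranked[0].name: "detail"}
        for device in ranked[1:]:
            roles[device.name] = "overview"
        return roles
```

Because the reading position lives in the session rather than on any one device, a user who starts on a phone and continues on a PDA resumes at the same place, which corresponds to the simple hand-over scenario described earlier.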

BACKGROUND 

To allow easy browsing of information on small 
devices, there is a need to develop efficient methods 
to support users. The problems that occur in 
information browsing on small-form-factor devices 
include two aspects: (1) how to facilitate information 
browsing on small display areas; and (2) how to help 
users access similar information on various 
devices. 

For the first case, many methods have been 
proposed for adapting various media on small display 
areas. In Liu, Xie, Ma, and Zhang (2003), the authors 
proposed to decompose an image into a set of 
spatial-temporal information elements and generate 
an automatic image browsing path to display every 
image element serially for a brief period of time. In 
Chen, Ma, and Zhang (2003), a novel approach is 
devised to adapt large Web pages for tailored display 
on mobile devices, where a page is organized into a 
two-level hierarchy with a thumbnail representation 
at the top level for providing a global view and index 
to a set of sub-pages at the bottom level for detailed 
information. However, these methods have not con- 
sidered utilizing multiple display areas in various 
devices to help information browsing. 
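
The two-level organization can be sketched as a data structure: a list of sub-pages at the bottom level and an index over them at the top level. The splitting rule below (greedy packing of consecutive blocks by rendered height) is only a stand-in assumed for illustration; it is not the page-analysis method of the cited work, and the `Block` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    height: int   # rendered height in pixels


def split_into_subpages(blocks, screen_height):
    """Greedily pack consecutive blocks into sub-pages.

    A sub-page is closed when the next block would not fit on the screen.
    A single block taller than the screen still gets a sub-page of its own.
    """
    pages, current, used = [], [], 0
    for block in blocks:
        if current and used + block.height > screen_height:
            pages.append(current)
            current, used = [], 0
        current.append(block)
        used += block.height
    if current:
        pages.append(current)
    return pages


def build_index(pages):
    """Top level: one entry per sub-page, labelled by its first block."""
    return [(number, page[0].text[:20]) for number, page in enumerate(pages)]
```

On a small display, the index plays the role of the thumbnail overview: the user picks an entry and is taken to the corresponding sub-page instead of scrolling through the whole page.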

For the second case, a number of 
studies address searching for relevant information 
across various media. Traditional image retrieval techniques 
are mainly based on content analysis, as in 
content-based image retrieval (CBIR) systems. In 
Dumais, Cutrell, Cadiz, Jancke, Sarin, and Robbins 
(2003), a desktop search tool called Stuff I’ve Seen 
(SIS) was developed to search desktop information 
including email, Web pages, and documents (e.g., 
PDF, PS, MSDOC, etc.). However, these approaches 
have not yet taken into account the 
distribution of information across various devices. What’s 
more, the user interface needs further consideration 
to facilitate users’ access to information 
that is distributed across various devices. 

In this article, we propose a cooperative framework to facilitate users’ information browsing in a mobile environment. The details are discussed in the following sections.



OUR FRAMEWORK 

Uniting Multiple Displays Together 

Traditionally, user interfaces for applications and documents have been designed mainly for desktop computers, and they are commonly too large to display on the small display areas of mobile devices. As a result, readability is greatly reduced, and users must interact far more, for example through continual scrolling and zooming.

However, as users acquire or gain access to an increasingly diverse range of portable devices, the situation changes: the display area is no longer limited to a single device but extends to the display areas of all available devices. According to existing studies, the user interface of future applications will exploit multiple coordinated modalities, in contrast to today’s uncoordinated interfaces (Coles et al., 2003). The exact combination of modalities will seamlessly and continually adapt to the user’s context and preferences. This will enable greater mobility, a richer user experience of the Web application, and a more flexible user interface. In this article, we focus on overcoming display constraints rather than other small form factors (Ma, Bedner, Chang, Kuchinsky, & Zhang, 2000) on mobile devices.

Ambient Intelligence technologies provide a vision for creating electronic environments that are sensitive and responsive to people. Brad (2001) proposed uniting desktop PCs and PDAs, with a PDA acting as a remote controller or an auxiliary input device for the desktop PC. That work focused on shifting the usage of mobile devices, mainly PDAs, toward extended controllers or peripherals, exploiting their mobility and portability. However, this approach does not work in many cases, such as for people on the move without access to desktop computers.

Although multiple displays are available to users, many difficulties remain in making multiple devices work cooperatively to improve the user’s experience of information browsing on mobile devices. In our framework, we design a distributed interface that spans devices to cooperatively present information to mobile users. We believe our work will benefit users’ browsing of, and access to, the available information on mobile devices with small display areas.






Communication Protocol 



The rapid growth of wireless connection technologies, such as 802.11b and Bluetooth, has made it easy for mobile devices to stay connected. We propose a communication protocol to maintain cooperative browsing across multiple devices. When a user manipulates information on one device, our task is to let the other devices work cooperatively. To illustrate the communication, we introduce two terms: (1) the master device is the device that is currently operated or manipulated by a user; and (2) a slave device is a device that displays cooperatively according to the user’s interactions with the master device.

We define the whole set of devices available to a user as a cooperative group. The rule is that there is only one master device in a cooperative group at a time, and the other devices in the group act as slave devices. We call a course of cooperative browsing with multiple devices a cooperative session. In such a session, the communication protocol is as follows:

• A user selects a device to manipulate or access 
information. The device is automatically set as 
the master device in the group. A cooperative 
request is then multicast to the slave devices. 

• The other devices receive the cooperative re- 
quest and begin to act as slave devices. 

• When the user manipulates information on the master device, features are automatically extracted by analyzing the interactions and are then transferred to the slave devices.

• According to the received features, the corre- 
sponding cooperative display updates are auto- 
matically applied on the slave devices. 

• When the user stops manipulating information on the master device, a cooperative termination request is multicast to the slave devices to end the current cooperative session.
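
As an illustration only, the session rules listed above can be sketched in a few lines of Python. The class and method names (`CooperativeGroup`, `select`, `interact`, `terminate`) are hypothetical and are not taken from the original system, and the multicast is simulated by a loop over in-memory objects rather than by real network traffic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Device:
    name: str
    role: str = "idle"  # one of "idle", "master", "slave"
    display: List[dict] = field(default_factory=list)  # updates applied so far


class CooperativeGroup:
    """All devices available to a user; at most one master at a time."""

    def __init__(self, names):
        self.devices: Dict[str, Device] = {n: Device(n) for n in names}
        self.master: Optional[str] = None

    def _slaves(self):
        return [d for d in self.devices.values() if d.name != self.master]

    def select(self, name):
        """The selected device becomes the master; the cooperative request
        is 'multicast' and every other device starts acting as a slave."""
        if self.master is not None:
            raise RuntimeError("a cooperative session is already active")
        self.master = name
        self.devices[name].role = "master"
        for device in self._slaves():
            device.role = "slave"

    def interact(self, features):
        """Features extracted from an interaction on the master are sent to
        the slaves, each of which applies a display update."""
        if self.master is None:
            raise RuntimeError("no active cooperative session")
        for device in self._slaves():
            device.display.append(dict(features))

    def terminate(self):
        """The termination request ends the current cooperative session."""
        for device in self.devices.values():
            device.role = "idle"
        self.master = None
```

A session then consists of one `select`, any number of `interact` calls, and a final `terminate`; rejecting a second `select` while a session is active is one simple way to enforce the single-master rule.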

Two-Level Browsing Scheme 

We set out to construct distributed user interfaces by uniting the multiple display areas of various devices to overcome the display constraint of a single device. In our framework, we propose a two-level cooperative browsing scheme, namely within-document and between-document browsing. If a document itself needs to be cooperatively browsed across devices, we call this within-document browsing. If, instead, a relevant document needs to be cooperatively browsed across devices, we call this between-document browsing.

1: Within-Document Cooperative Browsing

Many studies aim to improve the browsing experience on small screens. Some proposed rendering a thumbnail representation (Su, Sakane, Tsukamoto, & Nishio, 2002) on mobile devices. Although users can browse an overview through such a display style, they still have to use panning and zooming operations for a closer view. The user experience therefore has not improved, since these operations are difficult to carry out on a thumbnail representation.

We propose a within-document cooperative strategy to solve this problem, in which we develop a so-called two-level representation for a large document: (1) an index view is presented at the top level on the master device, with each index entry pointing to a detailed content portion of the document; and (2) a click on an index entry leads to an automatic display update of the corresponding detailed content on the slave devices.

We believe such an approach can help users browse documents on small devices. For example, users can easily access the content portions that interest them with a click on the index view rather than by scrolling.
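
The two-level representation can be illustrated with a minimal Python sketch. It rests on assumptions that are ours rather than the original system’s: sections are separated by blank lines, a short first line is treated as the section title, and otherwise the first eight words stand in for a summary. The function names `build_index` and `on_index_click` are hypothetical.

```python
from typing import List, Tuple


def build_index(document: str) -> List[Tuple[str, str]]:
    """Split a plain-text document into sections and label each one.

    Returns (label, content) pairs; the labels form the index view shown
    on the master device.
    """
    index = []
    for block in document.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        first_line = block.splitlines()[0].strip()
        words = first_line.split()
        if len(words) <= 8:
            label = first_line  # short first line: treat it as a title
        else:
            label = " ".join(words[:8]) + " ..."  # stand-in summary
        index.append((label, block))
    return index


def on_index_click(index: List[Tuple[str, str]], position: int) -> str:
    """Return the detailed content a slave device would display."""
    return index[position][1]
```

In this sketch the master device would show only the labels, and the string returned by `on_index_click` is what would be sent to a slave device for display.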

2: Between-Document Cooperative 
Browsing 

As shown in previous studies (Hua, Xie, Lu, & Ma, 2004, 2005; Nadamoto & Tanaka, 2003), users tend to view similar documents (e.g., images and Web pages) concurrently to compare their contents. The user’s experience is especially degraded in such scenarios, for two reasons. First, it is difficult for users to seek out relevant information on a mobile device, and the task becomes more tedious as the number of devices to search increases. Second, it is not feasible to present multiple documents simultaneously on a small display area, and it is also tedious for users to switch between documents for a comparative view.

In our system, we propose a between-document cooperative mechanism to address this problem. Our approach comprises two steps: (1) relevant documents are automatically searched for, based on the information the user is currently focusing on in the master device; and (2) the retrieved documents are presented on the slave devices. This method spares users the manual effort: they can obtain a comparative view simply by glancing across the devices.

APPLICATION OF OUR
COOPERATIVE FRAMEWORK

To assess the effectiveness of our framework, we apply it to several types of documents that are ubiquitous on mobile devices, including images, text documents (such as e-mail, PDF files, and MS documents such as DOC or PPT files), and Web pages. The following sections illustrate each in detail.



Cooperative Browsing of Documents

1: Within-Document Cooperative Browsing

Document readability is greatly degraded by the small display areas of current mobile devices; users have to scroll continually through the content to browse each portion in detail. We believe our within-document solution can solve this problem by: (1) partitioning a document into a series of content sections according to paragraph or passage information; (2) extracting a summary description for each portion using its title or subtitle (or a summary if there is no title); and (3) generating an index view for the document, in which each index entry points to the relevant detailed content portion of the large document.

Figure 1 shows an example of our solution, where an MS Word document is represented through its outline index, and a click leads to the display of the detailed content on its slave devices. The design is also useful for slides in practice. For instance, for a speaker who often moves around to stay in close contact with the audience, it is necessary to develop a mechanism that facilitates the speaker’s interaction with the slides while he or she moves around. We present an indexed view on a small device such as a SmartPhone, which the user can carry, and interaction with this phone generates the display updates on the screen.

2: Between-Document Cooperative Browsing



In our multi-device solution, we search for relevant documents automatically and then deliver them to the slave devices to help browsing. We automatically identify the passages currently displayed at the center of the screen on the master device and use them as the indicative text for finding relevant information. As many existing studies have pointed out, it is sometimes better to apply retrieval algorithms to portions of a document’s text than to all of the text (Stanfill & Waltz, 1992). Our system therefore creates new passages of a fixed length in words (e.g., 200 words). The system searches each device for a similar document using passage-level keyword feature vectors. The similarity of passages is computed using the cosine distance between the keyword feature vectors (e.g., the TFIDF vector model). In this way, our system searches for similar passages in the available document set, and the document with the greatest number of similar passages is selected as the similar document.
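
The passage-level search just described can be sketched as follows. This is a simplified illustration under our own assumptions, not the system’s actual implementation: tokenization is plain whitespace splitting, the weight of a term is tf × log(N / df) computed over all passages plus the query, and a passage counts as "similar" when its cosine similarity to the query reaches a hypothetical `threshold` parameter.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def passages(text: str, size: int = 200) -> List[List[str]]:
    """Cut a text into consecutive passages of `size` lowercase words."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def tfidf_vectors(all_passages: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term of each passage by tf * log(N / df)."""
    n = len(all_passages)
    df = Counter(term for p in all_passages for term in set(p))
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(p).items()}
            for p in all_passages]


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors (0.0 if either is all zero)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def most_similar_document(query: str, docs: Dict[str, str],
                          size: int = 200,
                          threshold: float = 0.1) -> Optional[str]:
    """Return the name of the document with the most passages similar to
    the query passage, or None if no passage reaches the threshold."""
    names, pool = [], []
    for name, text in docs.items():
        for p in passages(text, size):
            names.append(name)
            pool.append(p)
    vectors = tfidf_vectors(pool + [query.lower().split()])
    query_vec = vectors.pop()  # the last vector belongs to the query
    votes = Counter()
    for name, vec in zip(names, vectors):
        if cosine(query_vec, vec) >= threshold:
            votes[name] += 1
    return votes.most_common(1)[0][0] if votes else None
```

Here `query` stands for the indicative text taken from the master device and `docs` maps hypothetical document names to their text; both the threshold value and the inclusion of the query in the document-frequency counts are illustrative choices.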

Figure 2 shows an example of this case, where (b) is the search result returned by our approach for the content information extracted from (a).



Figure 1. An example for within-document 
cooperative browsing 



Figure 2. An example for between-document 
cooperative browsing 











Furthermore, our system automatically scrolls the relevant document to the passage with the maximal similarity.

Cooperative Browsing of Web Pages 

With wireless connections now pervasive on mobile devices, users can easily access the Web. However, Web pages are mostly designed for desktop computers, and the display areas of mobile devices are consequently too small to display them. Here, we apply our framework to generate, in a cooperative way, a tailored view of large Web pages on mobile devices.

1: Within-Page Cooperative Browsing

Unlike documents (e.g., MS Word), Web pages include more structured content. Many studies on page segmentation partition Web pages into a set of tailored blocks. Here, we adopt the method of Chen, Xie, Fan, Ma, Zhang, and Zhou (2003), in which each page is represented by an indexed thumbnail with multiple segments, each of which points to a detailed content unit. Figure 3 shows an example of this case. In our system, we deliver the detailed content blocks to various devices. Additionally, each detailed content block is displayed on the most suitable display area.

2: Between-Page Cooperative Browsing 

Besides better page readability on the small display areas of mobile devices, users also need to browse relevant pages that contain similar information. In common scenarios, users must search for these relevant pages manually, through a search engine or by checking related sites. Here, we develop an automatic approach that presents relevant Web pages through a cross-device representation.

Our method for finding similar pages comprises three steps: (1) extracting all links from the page, which are assumed to be potential candidates containing relevant information; (2) automatically downloading the content of each extracted link and representing each document as a term-frequency feature vector; and (3) comparing the similarity between each extracted page and the current page based on the cosine distance between their feature vectors. The page with the maximal similarity is selected as the relevant one, and its URL is sent to the other devices for an automatic display update. An example is shown in Figure 4.
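
The three steps can be sketched with Python’s standard library. To keep the example self-contained, the download in step (2) is replaced by a dictionary `fetched` that maps each URL to text assumed to have been retrieved already; that dictionary, the function names, and the whitespace tokenization are our assumptions and are not details of the original system.

```python
import math
from collections import Counter
from html.parser import HTMLParser
from typing import Dict, List, Optional


class LinkExtractor(HTMLParser):
    """Step 1: collect the href of every <a> tag on the page."""

    def __init__(self):
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def term_frequencies(text: str) -> Counter:
    """Step 2: represent a text as a term-frequency vector."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity of two term-frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def most_relevant_link(page_html: str, page_text: str,
                       fetched: Dict[str, str]) -> Optional[str]:
    """Step 3: return the linked URL whose text is most similar to the
    current page, or None if no linked page is available."""
    parser = LinkExtractor()
    parser.feed(page_html)
    current = term_frequencies(page_text)
    candidates = [url for url in parser.links if url in fetched]
    if not candidates:
        return None
    return max(candidates,
               key=lambda url: cosine(current, term_frequencies(fetched[url])))
```

The URL returned by `most_relevant_link` is what would be sent to the other devices for the display update.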

Cooperative Browsing of Images 

Large pictures are not well suited to display on mobile devices with small display areas. Here, we apply our framework to facilitate users’ image browsing through a cooperative cross-device representation.

1: Within-Image Cooperative Browsing

In addition to the previous automatic image browsing approaches for a single small device (Chen, Ma, & Zhang, 2003; Liu, Xie, Ma, & Zhang, 2003), our approach provides a so-called smart navigation mode (Hua, Xie, Lu, & Ma, 2004). In our approach, an image is decomposed into a set of attention objects according to Ma and Zhang (2003) and Chen, Ma, and Zhang (2003), and each is assumed to contain attentive information in the image. Switching objects on the master device results in a detailed rendering of the corresponding attention object on the slave device (e.g., Figure 5).

Figure 3. An example for within-page cooperative browsing

Figure 4. An example for between-page cooperative browsing

Figure 5. An example for within-image cooperative browsing

Figure 6. The synchronization between PDA1 and PDA2

2: Between-Image Cooperative Browsing

In our previous work (Hua, Xie, Lu, & Ma, 2004), we proposed a synchronized approach called ASAP to facilitate photo viewing across multiple devices, which can simultaneously present similar photos on various devices. A user can interact with any of the available devices, and the user’s interaction automatically generates synchronized updates on the other devices. In Figure 6, there are two PDAs, denoted PDA1 and PDA2, each storing a number of pictures. When a user clicks a photo on PDA1 (Figure 6a), two steps are carried out simultaneously: (1) PDA1 searches for similar images and displays them (b); and (2) PDA1 sends the image features to PDA2, which then searches for similar photos (c).



FUTURE TRENDS 

We are now planning to improve our work in three aspects. First, we will develop more accurate algorithms to search for relevant information across various media types, including images, text, and Web pages. Second, we plan to devise more advanced distributed interfaces to facilitate users’ information browsing tasks across various devices. Third, we plan to apply our work to other applications, such as more intricate document formats and application GUIs. We also plan to conduct a usability evaluation with a large number of users to collect their feedback, which will help us find the points they appreciate and the points that need further improvement.



CONCLUSION 

In this article, we developed a cooperative framework that utilizes multiple displays to facilitate information browsing in a mobile environment. Our approach employs a two-level browsing scheme, namely within-document and between-document browsing. We applied our framework to a variety of applications, including documents, Web pages, and images.

KEY TERMS

Ambient Intelligence: Represents a vision of 
the future where people will be surrounded by 
electronic environments that are sensitive and re- 
sponsive to people. 

ASAP System: An abbreviation for a synchronous approach for photo sharing across devices. It facilitates photo viewing across multiple devices by presenting similar photos on them simultaneously for comparative viewing or searching.

Attention Object: An information carrier that 
delivers the author’s intention and catches part of the 
user’s attention as a whole. An attention object often 
represents a semantic object, such as a human face, 
a flower, a mobile car, a text sentence, and so forth. 

Desktop Search: The functionality to index and retrieve personal information that is stored on desktop computers, including files, e-mails, Web pages, and so on.

Small Form Factors: Mobile devices are designed for portability and mobility, so their physical size is limited. This characteristic is referred to as small form factors.



TFIDF Vector Model: TF is the raw frequency of a given term inside a document, which provides one measure of how well that term describes the document’s contents. DF is the number of documents in which a term appears. The motivation for using an inverse document frequency is that terms that appear in many documents are not very useful for distinguishing a relevant document from a non-relevant one.



Visual Attention: Attention is a neurobiological concept. It implies the concentration of mental powers upon an object by close or careful observing or listening, that is, the ability or power to concentrate mentally.

INTRODUCTION

Cooperation with computers, or Computer Sup- 
ported Cooperative Work (CSCW), started in the 
1990s with the growth of computers connected to 
faster networks. 

Cooperation and Coordination 

CSCW is a multidisciplinary domain that includes 
skills and projects from human sciences (sociology, 
human group theories, and psychology), cognitive 
sciences (distributed artificial intelligence), and com- 
puter science (human/computer interfaces; distrib- 
uted systems; networking; and, recently, multime- 
dia). 

The main goal of the CSCW domain is to support group work through the use of networked computers (Ellis et al., 1991; Kraemer et al., 1988). CSCW can be considered a specialization of the Human-Computer Interaction (HCI) domain in the sense that it studies interactions of groups of people through distributed groups of computers.

Two main classes can be defined within CSCW systems. Asynchronous cooperation does not require the co-presence of all the group members at the same time. People interact through asynchronous media such as e-mail messages, on top of extended and improved message systems. In contrast, synchronous cooperation creates stronger group awareness, because the systems supporting it require the co-presence of all the group members at the same time. Exchanges among group members are interactive, and nowadays most of them use live media (audio- and videoconferences).



Groupware (Karsenty, 1994) is the software and technological part of CSCW. The use of multimedia technologies leads to the design of new advanced groupware tools and platforms (Williams et al., 1994), such as shared spaces (VNC, 2004), electronic boards (Ellis et al., 1991), distributed pointers (Williams et al., 1994), and so forth. The major challenge is building integrated systems that can support the current interactions among group members in a distributed way.

Coordination deals with enabling and controlling 
cooperation among a group of human or software 
distributed agents performing a common work. The 
main categories of coordination services that can be 
distinguished are dynamic architecture and compo- 
nents management; shared workspace access and 
management; multi-site synchronization; and 
concurrency, roles, and group activity management. 

Related Projects 

Much research and development on distance learning is carried out within the framework of the more general CSCW domain.

Some projects, such as the Multipoint Multimedia Conference System (MMCS) (Liu et al., 1996) and the Ground Wide Tele-Tutoring System (GWTTS) (GWTTS project, 1996), use video communication through videoconference systems, communication boards, and shared spaces built on top of multipoint communication services. The Distance Education and tutoring in heterogeneous teleMatics environments (DEMOS) project (Demos project, 1997) uses common public shared spaces to share and remotely control any Microsoft Windows application. The MultiTeam project (MultiTeam project, 1996) is a Norwegian project that links distributed classrooms over an ATM network through a giant electronic whiteboard the size of an ordinary blackboard. Microsoft NetMeeting (Netmeeting, 2001) and Intel Proshare (now Polycom company) (Picture Tel, 2004) are very popular synchronous CSCW toolkits based on the H.320 and H.323 standards; both are composed of a shared electronic board, a videoconference, and an application sharing space. These tools are mostly used over the classical Internet, which limits their efficiency because its quality of service is not guaranteed and its rate is irregular. Most of their exchanges are based on short events sent over TCP/IP in peer-to-peer relationships. The ClassPoint (ClassPoint, 2004) environment was created by First Virtual Communications (formerly White Pine). It is composed of three tools. A videoconference tool based on See You See Me provides direct video contact among the distributed group members, with the dialogs and views of the students under the control of the teacher. An electronic whiteboard reproduces the interactions made on classroom blackboards. A Web browser has been customized with a synchronous browsing function led by the teacher and viewed by the whole distributed class. This synchronous browsing can be relaxed by the teacher to allow freer student navigation.



BACKGROUND 

A Structuring Model for Synchronous 
Cooperative Systems 

Different requirements are identified for the design of networked synchronous CSCW systems. Such systems may be designed to improve the efficiency of the group working process through high-quality multimedia material. The networked system must support both small- and large-scale deployments, allowing restricted or universal access (Demos project, 1997). Defining the requirements of a networked synchronous CSCW system requires multidisciplinary expertise and collaboration. For this purpose, we distinguish three viewpoints: functional, architectural, and technological.

Moreover, several objectives may be targeted by the networked solution retained for the synchronous CSCW system, including adaptability, upgradability, multi-user collaboration, and interaction support. In practice, the design and development of a networked solution involve, first, general skills such as software architecture, knowledge organization, and management of other work resources, and, second, the development of domain-specific multimedia cooperation tools. We identify three generic interaction levels that are likely to be significant for the different viewpoints: the cooperation level, the coordination level, and the communication level. Their content is summarized in Table 1.

For software architecture design, level-based layering allows different technologies to be used for implementing, integrating, and distributing the software components. This separation increases the upgradability of the systems. Layering also makes it more likely that the implemented system can guarantee end-user quality of service while taking advantage of the different access facilities. For adaptability, multi-user collaboration, and interaction support, level-based decomposition allows functional separation between individual behaviors and the definition of group interaction rules.






Table 1. The three levels and three viewpoints of the structuring model

Interaction levels / Viewpoints | Cooperation | Coordination | Communication
Functional view | User-to-user interaction paradigms | User-level group coordination functions (sharing and awareness) | User-to-user information exchange conventions
Architectural view | Cooperation tools | Software-level group coordination services (for tools and components) | Group communication protocols (multipeer protocols)
Technological view | Individual tool implementation technology (interfacing and processing) | Components integration technology | Component distribution technology (request transport protocols)

From a functional viewpoint, the cooperation level describes the interaction paradigms that underlie the member interactions during the group working process. From an architectural viewpoint, it defines the tool set covering the above paradigms. From a technological viewpoint, the cooperation level describes the different technologies that are used to implement the individual tools, including interfacing and processing.

From a functional viewpoint, the coordination level lists the functions that are used to manage user collaboration, providing document sharing and group awareness. From an architectural viewpoint, it describes the underlying user services, such as tool group coordination, membership management, and component activation/deactivation. From a technological viewpoint, the coordination level defines the integration technology used to make components interact, including interaction among peers (different components of the same tool), tools, and users.

From a functional viewpoint, the communication 
level describes the conventions that manage the 
information exchange between users. From an archi- 
tectural viewpoint, it enumerates multicast protocols, 
allowing user-to-user or tool-to-tool group communi- 
cation. From a technological viewpoint, the commu- 
nication level defines the protocols used to handle 
groups and to exchange requests between compo- 
nents, including tools and services. 

In the remainder of this article, this multi-view/level model for cooperative applications is applied to the analysis, comparison, and development of a group-based networked synchronous platform. The application domain of this platform is Distributed System Engineering.

DISTRIBUTED SYSTEM 
ENGINEERING (DSE) 

The Distributed System Engineering (DSE) European Project (contract IST-1999-10302) started in January 2000 and ended in January 2002 (DSE project, 2002). It was conducted by an international consortium. The following members have strong involvement in the space business: Alenia Spazio (ALS), coordinator of the project; EADS Launch Vehicles (ELV); and IndustrieAnlagen BetriebsGesellschaft (IABG). The other members of the DSE project are technology providers and research centers: Silogic, Societa Italiana Avionica (SIA), University of Paris VI (LIP6), LAAS-CNRS, and D3 Group.

Research Objectives 

The Engineering Life Cycle for complex systems design and development, where partners are dispersed in different locations, requires the setup of adequate and controlled processes involving many different disciplines (Drira et al., 2001; Martelli et al., 2000).

The design integration and the final physical/functional integration and qualification of the system imply a high degree of cross-interaction among the partners. The in-place technical information systems supporting the life cycle activities are specialized with respect to the needs of each actor in the process chain and are highly heterogeneous.

To innovate the in-place processes globally, the specialists involved must be able to work as a single team in a Virtual Enterprise model. To this end, it is necessary to make the different Technical Information Systems interoperable and to define Cooperative Engineering Processes that take into account distributed roles, shared activities, and distributed process controls. DSE is an innovative study in this context. It addressed this process with the goal of identifying and carrying out proper solutions in terms of design, implementation, and deployment.

DSE Software Platform 

The software platform that has been realized to 
support Cooperative Engineering scenarios is com- 
posed of several distributed subsystems, including 
Commercial Off-The-Shelf (COTS) components 
and specifically developed components. 

Session Management 

The Responsibility and Session Management subsystem, developed at LAAS-CNRS (Molina-Espinosa, 2003), is the central component (Figure 1). It is in charge of session preparation and scheduling. It supports the definition and programming of user profiles, roles, and session characteristics, using a remote responsibilities and schedule repository. The environment provides both synchronous and asynchronous notification media for troubleshooting, tuning, and negotiation during the preparation and enactment of collaboration sessions. Its Graphical User Interface (GUI) is implemented with Java Swing.

Figure 1. Responsibility and session management GUI

DSE Architecture 

The DSE global collaboration space integrates (i) multipoint communication facilities (Multicast Control Unit [MCU]: the MeetingPoint server); (ii) several groupware tools (videoconferencing: CU-SeeMe or NetMeeting; document sharing: E-VNC); and (iii) domain-specific tools (Computer Aided Design [CAD] tools or System Engineering tools).

The DSE architecture is composed of several clients and servers. Clients and servers are linked through high-level communication facilities based on CORBA, Java RMI, and XML messages.

The Main DSE server contains an Enterprise Data Repository (EDR), which stores the documents handled by the session members, and an HTTP server as a unique front end.



The EDR is implemented using the following technologies:

• ORACLE Portal 3.0 for EDR features

• ORACLE database 8i release 2

Communications are implemented with:

• DSE WEB Portal (HTML pages intended to be generated and managed by ORACLE Portal 3.0)

• Servlet Engine: Jserv (Apache’s JSP/Servlet Engine)

The DSE front end provides a unique entry point for all DSE users. It allows transparent access to all the DSE subsystems described below. The chosen HTTP server is:

• Apache server included with ORACLE Portal 3.0



The DSE Awareness Subsystem is a notification
service. It is composed of a notification server based
on a commercial CORBA 2.3-compliant ORB service.

The DSE Responsibility and Session Management
(RMS and SMS) subsystem provides communication
facilities for managing and activating sessions.
It is composed of an HTTP server for running
servlets and Java middleware that provides
RMI communications.
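As a rough illustration of how a single notification service can feed both synchronous and asynchronous media, the following minimal sketch delivers an event immediately to connected listeners and queues it for members who read it later. It is an in-process stand-in written for this illustration only; the class and method names are hypothetical and it does not reproduce the CORBA-based DSE server.

```python
from collections import defaultdict, deque


class NotificationServer:
    """Toy in-process stand-in for a notification service (hypothetical API)."""

    def __init__(self):
        self._listeners = defaultdict(list)   # topic -> callbacks, called at publish time
        self._deferred = defaultdict(set)     # topic -> users who read events later
        self._mailboxes = defaultdict(deque)  # user -> queued (topic, event) pairs

    def subscribe(self, topic, callback):
        """Synchronous medium: the callback runs as soon as an event is published."""
        self._listeners[topic].append(callback)

    def subscribe_deferred(self, topic, user):
        """Asynchronous medium: events are stored until the user fetches them."""
        self._deferred[topic].add(user)

    def publish(self, topic, event):
        for callback in self._listeners[topic]:
            callback(event)
        for user in self._deferred[topic]:
            self._mailboxes[user].append((topic, event))

    def fetch(self, user):
        """Return and clear the queued events of one user."""
        mailbox = self._mailboxes[user]
        events = list(mailbox)
        mailbox.clear()
        return events
```

A connected session member would register a callback with `subscribe`, while a member who is offline during the session would be registered with `subscribe_deferred` and call `fetch` later.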

A typical configuration of a DSE client contains: 

(i) Groupware tools 

H.323 videoconference client: MS NetMeeting
or CU-SeeMe

Application sharing client: Extended-VNC package
(E-VNC) (Client, Proxy with chairman
GUI, Server).

(ii) Generic communication tools 

Web Browser (Internet Explorer, Netscape, or 
Mozilla) 

Mail client (NT Messenger, Outlook Express, 
or Eudora) 

(iii) Management interfaces 

Session and Responsibility Management GUI 
(iv) Domain-specific interfaces

A graphical client component for distributed 
simulation based on High Level Architecture 
(HLA) 

Remark: Some light DSE clients communicate only
through Web-based protocols and do not use
CORBA services.

A Preliminary Design Review (PDR) server is
added for the domain-specific application. The PDR
validation scenario is presented in a later section.

DSE Deployment Architecture 

Figure 2 shows the architecture of the DSE platform 
with clients and servers distributed among the in- 
volved partners. The network connections used by 
the partners during the Integration and Validation 
phases are listed below: 

• ALS (Turin): ISDN 2xBRI with IABG, Public
Internet



• ELV (Paris): ATM 2Mbps to/from LIP6, Pub- 
lic Internet 

• IABG (Munich): ISDN 2xBRI with ALS, 
Public Internet 

• SILOGIC (Toulouse): Public Internet 

• D3 (Berlin): Public Internet 

• SIA (Turin): Public Internet, High Speed local 
connection with ALS 

• LAAS (Toulouse): ATM to/from LIP6, Pub- 
lic Internet 

• LIP6 (Paris): ATM to/from LAAS and ELV, 
Public Internet 

Machines used for validation were running Win- 
dows NT 4.0. However, the light DSE clients also 
can run on PCs with Windows 2000 or on Solaris 2.x 
SUN workstations. 

Validation Scenario 

The context of collaborative design and analysis 
scenarios is based on real design material from the 
flight segment development program of the Auto- 
mated Transfer Vehicle (ATV). The ATV missions 
are to: 



• Deliver freight to the International Space Station
(ISS)

• Reboost the ISS

• Collect ISS waste and destroy it

Figure 2. The global DSE servers deployment architecture



Preliminary Design Review

The precise DSE validation scenario is called the
Preliminary Design Review (PDR). The PDR process
is a highly codified design phase in which a set of
contractors has to submit a large set of documentation
to the customer review team. The PDR review
objectives are:

• Obtaining an independent view of program
achievements

• Identifying difficulties and major risks in order
to reduce them

• Assessing the progress report

• Putting documents and product in a validated
baseline

• Supporting the management in deciding whether
to continue

The PDR actors form a large group composed of
review committees, project team interfaces, and
review boards. The review committees contain 50 to
100 experts split into five to 10 thematic groups.
They are controlled by a review chairman and a
secretary. They examine documents and write Review
Item Discrepancies (RIDs), which are a kind of
codified comment. The project team interfaces are
composed of eight to 12 project team members.
They answer committee comments and board
recommendations. The review board is composed of
the customers with the support of prime contractors.
They decide to implement or reject the review
committee's recommendations. Then, they decide
whether to begin the next program phase.

Distributed PDR Scenario

The main application goal of the DSE project is to
carry out the PDR process in a distributed way. The
three phases of the PDR (prepare the review,
execute the review, and process the review
results) were supported by the set of distributed
networked tools and components that were
developed and integrated. Using this set of components,
the session manager was able to organize the
review. This included definitions of group members,
tasks, roles, and progress level of the review. The
second service offered was the management of the
data issued during the PDR activity, the Review
Item Discrepancies (RIDs). Using the EDR database
on the main DSE server, the RID documents
were created remotely, accessed, shared, and
modified during the review process.



MODEL-BASED OVERVIEW OF THE 
DSE ENVIRONMENT 

Table 2 represents the design model instance
describing the DSE environment. The functional view
directly refers to the DSE PDR scenario. The
architectural view represents all the tools and
components used to support the PDR scenarios. The
technological view focuses on infrastructure,
languages, and protocols used for the development and
the integration of the global environment.

Table 2. An overview of the DSE model

Functional view
  Cooperation:   Actions on Review Item Discrepancies (RIDs)
  Coordination:  PDR role attribution (Review Committee,
                 Project Team, Review Board)
  Communication: PDR role based: Comment/Answer, Commit/Reject

Architectural view
  Cooperation:   Videoconference; Application Sharing Tool;
                 Text Chat
  Coordination:  Session management services (definition and
                 enactment); Multi-user coordination
  Communication: Audio/Video multicast protocol; IP multicast;
                 Reliable multicast

Technological view
  Cooperation:   H.263 video cards; JAVA AWT/SWING
  Coordination:  JAVA applets; JAVA/CORBA objects; XML Canvas
  Communication: JAVA RMI; CORBA IIOP protocol; WWW/HTTP
                 protocol

FUTURE TRENDS 

Future trends will be to apply the proposed design 
framework in different application contexts where 
CSCW technology also should be useful. The next 
application domain considered is distance learning. 

We are now involved in a European IST project
called Lab@Future, which started in May 2002 and
will last for three years. It aims to define a generic
and universal platform for mixed and augmented
reality. This platform must be able to support
interactions through mobile devices and wireless
networks.

It will be applied in the framework of new
educational theories, such as activity theory and
constructivism, to give schools remote access to
advanced laboratory experiments.

These theories, through frameworks such as real-time
problem solving and collaborative, exploratory, and
interdisciplinary learning, consider the learning
activity as a strongly interactive building process
through various knowledge exchanges.

The mixed and augmented reality platform will be
deployed in eight European countries. It will be used
for laboratory teaching experiments within fluid
mechanics, geometry, art and humanities, and
human science.



CONCLUSION 

In this article, we presented an overview of recent 
research results in the design of cooperative sys- 
tems for supporting distributed system engineering 
scenarios. A layered multi-viewpoint structuring 
approach is adopted as a design framework. It 
covers three viewpoints: functional, architectural, 
and technological views involved in CSCW environ- 
ments. It also identifies three interaction levels: 
cooperation, coordination, and communication. 
KEY TERMS

Application-Sharing Space: A groupware tool 
that produces multiple distributed remote views of a 
particular space. Any single-user application put 
under the control of the particular space can be 
viewed remotely and controlled by the group mem- 
bers that have access to this space. Therefore, the 
application-sharing space transforms any single- 
user application put under its control into a multi-user 
shared application. 

Asynchronous Cooperation: Members are not
present at the same time within the cooperation
group (no co-presence). They communicate with
asynchronous media (e.g., e-mail messages) on top
of extended and improved message systems.

Computer Supported Cooperative Work
(CSCW): According to Ellis et al. (1991), CSCW
platforms are “computer-based systems that
support groups of people engaged in a common task (or
goal) and that provide an interface to a shared
environment” (p. 40). Another similar definition
(Kraemer et al., 1988) is computer-based technology
that facilitates two or more users working on a
common task.



Cooperation: A group of people working on a 
common global task. 

Coordination: Enabling and controlling the co- 
operation among members of a group of human or 
software-distributed agents. It can be considered as 
software glue for groupware tools, including archi- 
tectural and behavioral issues. Coordination includes 
several synchronization and management services. 

Electronic Board: A classic groupware tool 
that supports the functionalities of a traditional 
whiteboard (sharing sketches, pointing, annotating) 
through a set of distributed computers. 

Groupware: The software and technological 
part of CSCW. It contains application studies and 
platforms adapted to groups and supporting group 
working. 

Synchronous Cooperation: Members are
present at the same time within the cooperation
group (co-presence). The communications among
them are interactive and made with live media, such
as videoconferences or application-sharing spaces.

System Engineering: The branch of engineer- 
ing concerned with the development of large and 
complex systems. It includes the definition and setup 
of adequate and controlled processes for the design 
and development of these complex systems. 
INTRODUCTION

Addressing intercultural considerations is increas- 
ingly important in the product development pro- 
cesses of globally active companies. 

Culture has a strong relevance for design: it
influences the daily usage of products via their
design. While people grow up and are socialized,
the daily use of and interaction with different
products are very important, because they shape
users' basic interaction styles and thus their
behavior. Education and interaction styles vary
across cultures. Hence, it is worth examining the
relation between users' behavior and culture.



BACKGROUND 

For many years, researchers of social sciences have 
analyzed cross-cultural differences of interpersonal 
communication styles and behavior (Hofstede, 1991; 
Trompenaars, 1993). During the last ten years, 
usability engineers also have focused on intercul- 
tural differences of icon/color coding, navigation, 
and other human-machine interface components 
(Hoft, 1996; Marcus, 1996; Prabhu & Harel, 1999). 
The time for product development and the time
between redesign efforts are both becoming shorter
than ever before. Consequently, to prepare
interactive products effectively for the global market,
one must engineer their intercultural attributes for
such a market. One basic step in intercultural
engineering is the analysis of user requirements in
different cultures.

Within the framework of the project described in
the following, a requirement analysis of user needs
in mainland China was conducted as the first step of
a human machine system localization. But why do you
have to localize your product? Why is it important to
know the culture of a target user group?

Bourges-Waldegg (2000) says: 

...Design changes culture and at the same time is 
shaped by it. In the same way, globalization is a 
social phenomenon both influencing and 
influenced by design, and therefore by culture..., 
both globalization and technology have an effect 
on culture, and play a role in shaping them. 

This article describes the analysis of
culture-specific information from users in Mainland
China and the application of different methods for
different design issues, especially in an
intercultural context. Selected results of this
analysis are also presented. The analysis and its
results are part of the project INTOPS-2: Design for
the Chinese market, funded by several German
companies.

The project was carried out by the Center for
Human Machine Interaction at the University of
Kaiserslautern, Germany. The aim of INTOPS-2 was
to find out the influence of culture on the design of
human machine systems, to analyze local specifics
in the area of machine tools, and to analyze the
requirements of Chinese users in that area.



USER REQUIREMENT ANALYSIS IN 
MAINLAND CHINA 

Study Outline 

The requirement analysis in China was carried out at
the end of 2000. Over two months, 32 Chinese
organizations in Shanghai, Beijing, and Chongqing
were visited, of which 26 were Chinese industrial
enterprises (including Chinese machine tool
producers and some machine users). The other six
organizations included some governmental
organizations for import administration and some
research institutes for machine user-interface design
in China. The analysis was conducted by a native
Chinese-speaking researcher from the Center for
Human-Machine-Interaction in Kaiserslautern, Germany.

Study Methods 

The following three investigation methods were
applied in the INTOPS-2 project: test, questionnaire,
and interview. These methods were adopted from a
similar previous project, INTOPS-1 (see also
Zühlke, Romberg, & Rose, 1998). The
tests are based on the analysis of the following:
Choong and Salvendy (1998); Shih and Goonetilleke
(1998); Dong and Salvendy (1999); Piamonte,
Abeysekera, and Ohlsson (1999); and Rose (2002a).
However, to find out more details of the Chinese
user requirements, a few new tests and a more
detailed questionnaire and interview checklist were
developed for the INTOPS-2 project. An overview
of all the implemented tests is presented in
Table 1.



Table 1. Overview of implemented tests

No. 1  Test: Preference to color composition for machine tools
  Aim:      Eliciting preferred color composition and difference
            to German one
  Material: 10 cards with differently colored machine tools
  Subject:  No special requirement
  Analysis: Average preference degree for each composition

No. 2  Test: Recalling performance for graphical information vs.
       textual information
  Aim:      Testing information processing ability for different
            information presentation methods
  Material: 3 pieces of paper with different ways of info.
            presentations: only text; only picture; text & picture
  Subject:  No special requirement
  Analysis: Average recall rate for each method. Characters for
            better recalled info.

No. 3  Test: Understanding of color coding
  Aim:      Testing the understanding of standard color coding
            and difference to German one
  Material: 7 standard colors of IEC 73. 3 groups of concepts in
            daily life and at work (5 in each one)
  Subject:  Matching for concepts at work only for machine
            operators
  Analysis: The color association rate for each concept

No. 4  Test: Symbol understanding
  Aim:      Testing the understanding of standard ISO symbols and
            eliciting the preferred symbol characteristics for
            information coding
  Material: Icons from ISO and Windows. 2 kinds of materials:
            18 icons, each with 3 possible meanings;
            14 meanings, each with 3 possible icons
  Subject:  Machine operators
  Analysis: Average recognition rate for each icon. Character for
            better matched icon

No. 5  Test: Familiarity with Windows interface
  Aim:      Testing the familiarity with the Windows interface
  Material: Integrated with Test 4
  Subject:  Machine operators
  Analysis: Recognition rate for Windows icons

No. 6  Test: Concept of grouping (Type of Card Sorting)
  Aim:      Eliciting the grouping rule and the difference to
            German one
  Material: 74 cards with different CNC machine functions
  Subject:  Only with experienced CNC machine operators
  Analysis: Preferred structure for grouping

No. 7  Test: Preference for screen layout
  Aim:      Eliciting the familiar screen layout characters and
            difference to German one
  Material: Over 20 different cards in form and size representing
            the screen elements
  Subject:  CNC machine operators
  Analysis: Preferred layout for different screen elements

No. 8  Test: Understanding of English terms
  Aim:      Testing the English understanding ability
  Material: One table with 54 English technical terms
  Subject:  Machine operators
  Analysis: Average understanding rate. Character for better
            understanding
Figure 1. Part of the questionnaire, originally used in Chinese

Have you planned to purchase new production equipment in 2001?
□ yes □ no

How important are the following factors in your purchasing of machines? (higher value is more important)

function range

machining quality

application of new technologies

operation convenience

requirements on running environment

machine's price

others
The main categories of the questionnaire and
interview checklist are the following:

• Questionnaire: Basic information about visited
companies, information about machine purchasing,
requirements on machine service, requirements
on machine user interface, and requirements on
machine technical documentation (e.g., Figure 1)

• Interview: Basic information about visited
companies, information about machine purchasing,
application situation of imported machines in
practice, requirements on machine user interface,
requirements on technical documentation,
requirements on service, information about work
organization, and training.

Study Conditions 

The tests were conducted in the break room. The
subjects were all Chinese machine operators. A total
of 42 users participated in the tests, most of
them in more than one test. The total test time per
subject was kept within 30 minutes, which proved in
practice to be almost the maximum time a subject
was willing to participate. Due to this
constraint, none of the subjects participated in
more than four tests.

The interviews were mainly conducted in the
offices of the interviewees. All interviews were
conducted in Chinese. Since the investigator was a
native Chinese speaker, there was no problem in
language communication.

In total, 35 main interviewees were involved.
Fifty-eight percent of the interviewees were
machine tool users in China. They came from different
application areas, such as the automobile, motorcycle,
and machine tool industries. The interviewees
were either responsible for machine purchasing
decisions (the chief manager, chief engineer, and
factory/workshop director) or responsible for
actual machine applications in practice (the chief
engineer, factory/workshop director, equipment
engineer, and technician).

Most of the questionnaires were filled out di- 
rectly after the interviews by the same interviewees 
or by other people in charge. A total of ^question- 
naires were obtained in the investigation. About 
63% of the questionnaires were filled out by
state-owned firms in China. Forty-seven percent of
the questionnaires were filled out by machine tool 
users in China. 

Brief Results 

As an example of the test results, the English
understanding test revealed that Chinese machine
operators' understanding of English is generally
insufficient for safe and effective machine
operation (see Test 8 in Table 1 and Rose, Liu,
& Zühlke, 2001). Table 2 shows the results for the
understanding rate of the English terms.
Table 2. Understanding rate of the English terms (%)

Terms  OK    OFF   Stop  Home  Help  kg    ON    Start  mm    End   Enter  Del
%      95.7  82.6  82.6  73.9  73.9  73.9  69.6  69.6   69.6  65.2  56.5   52.2

Terms  Exit  Reset Shift Menu  Edit  ipm   inch  Disk   Undo  Esc   Icon   PIN
%      43.5  39.1  39.1  39.1  30.4  26.1  26.1  26.1   21.7  21.7  13.0   8.7

Therefore, text coding in English should be
restricted to a very limited scope (see also del Galdo,
1990). More detailed requirements at this point are
summarized in the following:

1. The most important information should at any
time be encoded using Chinese text (in
combination with other information coding methods
such as color, shape, and position).

2. Only when really necessary should English text
be used for information coding, and then only
terms that express general machine operations
(such as On, Off, Start, and Stop) or that are
closely related to daily use (such as OK, Yes,
Help, and Home). Most English abbreviations
should be avoided in text labeling.

3. In many cases, other information coding methods
should be applied together with Chinese
text to ensure that the machine can be well
maintained by foreign service engineers.
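To make the consequence of these requirements concrete, the following sketch splits the terms of Table 2 into those that might be left in English and those that should be shown in Chinese. The rates are transcribed from Table 2; the 65% cut-off and the function name `split_terms` are assumptions chosen purely for illustration, since the study itself does not state a numeric threshold.

```python
# Understanding rates (%) transcribed from Table 2.
UNDERSTANDING_RATE = {
    "OK": 95.7, "OFF": 82.6, "Stop": 82.6, "Home": 73.9, "Help": 73.9,
    "kg": 73.9, "ON": 69.6, "Start": 69.6, "mm": 69.6, "End": 65.2,
    "Enter": 56.5, "Del": 52.2, "Exit": 43.5, "Reset": 39.1, "Shift": 39.1,
    "Menu": 39.1, "Edit": 30.4, "ipm": 26.1, "inch": 26.1, "Disk": 26.1,
    "Undo": 21.7, "Esc": 21.7, "Icon": 13.0, "PIN": 8.7,
}


def split_terms(rates, threshold):
    """Split terms into (keep in English, show in Chinese) by understanding rate."""
    keep = [term for term, rate in rates.items() if rate >= threshold]
    translate = [term for term, rate in rates.items() if rate < threshold]
    return keep, translate


# Hypothetical cut-off: the study gives no numeric threshold.
keep, translate = split_terms(UNDERSTANDING_RATE, threshold=65.0)
print("English acceptable:", keep)
print("Use Chinese text:", translate)
```

With this assumed cut-off, the terms kept in English are mostly the general operation and daily-use words named in requirement 2, while abbreviations such as Del, Esc, and ipm fall on the Chinese side.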

Chinese machine operators often have very little
experience with operating a Microsoft Windows
user interface. Therefore, applying the Windows
user interface to machine operation in
China will meet some problems at the beginning.
Furthermore, the more freely structured dialogue
provided by the MS Windows user interface, in
comparison to guided dialogue, could also make Chinese
machine operators unsure about their operations.
The results suggest that the application of the MS
Windows user interface in China is not advisable
for machine operation at present.

The most obvious differences in menu structure
requirements come from the different working
organization in China. A Chinese operator's task
range is narrower than that of a German machine
operator. In fact, the interviews showed
that Chinese users require a self-defined user
interface to fit their individual production tasks. In
conclusion, the menu structure should correspond
well to the operation tasks of the Chinese machine
operators and to the culture-based organizational and
structural diversity. The new menu structure should
be characterized by separate (and hierarchical)
access rights of different machine operators to
different machine functions, with the menu structure
for each operator group thus simplified. For machine
operators, the operation functions should be much
simpler. This could also make specific machine
functions quicker to reach.

Because the actual working organization for each
customer is different, it is impossible for machine
producers to provide an individual menu structure
for each customer. Generally, machine producers
should provide one consistent menu structure,
together with the flexibility for users to configure
the structure for a particular working organization.
Based on this consideration, modularizing the menu
structure to leave room for further adaptation is a
good design strategy. The menu structure for a
specific customer can then be configured according
to individual needs.
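One possible reading of this strategy is sketched below: the producer ships one standard set of menu modules, each tagged with a minimum access level, and a customer derives the menu for each operator group by filtering on that level and switching off unused modules. All module names, roles, and levels here are hypothetical examples, not taken from the INTOPS-2 results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MenuModule:
    name: str
    min_level: int            # lowest access level that may see this module
    functions: tuple = ()     # machine functions offered by the module


# Hypothetical hierarchical access levels; each customer would define its own.
ACCESS_LEVEL = {"operator": 1, "setter": 2, "engineer": 3}

# One consistent menu structure provided by the machine producer.
STANDARD_MENU = (
    MenuModule("Production", 1, ("Start", "Stop", "Select program")),
    MenuModule("Setup", 2, ("Tool offsets", "Zero points")),
    MenuModule("Programming", 3, ("Edit program", "Simulate")),
    MenuModule("Diagnostics", 3, ("Alarm history", "Axis status")),
)


def menu_for(role, menu=STANDARD_MENU, disabled=()):
    """Return the module names visible to a role, minus customer-disabled ones."""
    level = ACCESS_LEVEL[role]
    return [m.name for m in menu if m.min_level <= level and m.name not in disabled]


print(menu_for("operator"))
print(menu_for("engineer", disabled=("Diagnostics",)))
```

The operator group sees only the simplest module, while higher levels inherit everything below them; the `disabled` argument stands in for the customer-specific configuration step.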

Chinese customers have a very low assessment
of their own machine operators. The operators
themselves also have very low self-evaluations.
Both groups often remarked that operators
have quite low qualifications and could not
understand complicated machine operations. This
suggests that machine operators may be afraid
to interact actively with the machine and
would prefer to follow definite operation
instructions provided by the system. Consequently, the
dialogue system must provide more error tolerance
for the operation and should offer very clear guidance
for workers to enable them to follow the operation
process.

The questionnaire results show that the main
problem of the Chinese machine operators is a
poor understanding of the different machine
functions and operation processes. Accordingly,
the user interface must provide a navigation
concept that presents general machine functions and
operation processes in an overview, so that operators
can easily establish a clearer overall image of
machine operation.

FUTURE TRENDS 

Finally, this article should mention cultural diversity
in user requirements and its relevance in times of
globalization. Mainland China is only one example of
a new and growing market with an interesting and
diverse user culture.

Product localization is not possible without
requirement analysis in the target user culture. Only
this can guarantee that user expectations of
product use will be met in the future. In an
internationalization or localization context, this
method is a necessary basis for engineering
user-oriented human machine systems (see Rose, 2002).

Researchers and developers working on advancing
user-oriented design must realize that,
in times of globalization, culture orientation is an
essential component of successful usability and
user friendliness. Culture is an influencing factor in
user interface design, and it is also an element of
user experience. Engineers of products for the
global market have to address this issue (see Rose,
2001).

CONCLUSION 

This article described the INTOPS-2 project and
summarizes general lessons learned for
such intercultural studies.

First, it took around nine months to prepare
the study, which included analyzing cultural
specifics, studying regulations and analyzing system
documentation, constructing test and questionnaire
materials, building up an industry-based and
industry-funded working group to discuss problems
and detail the focus of the study, contacting firms,
arranging visits, conducting pre-tests, and many
other tasks. Careful preparation is very important
for completing an evaluation of an intercultural
project. For such field analysis, everything must be
made easy.

Second, all the test material was translated into
Chinese. This is the only acceptable method of
localization research and a guarantee for the
collection of trustworthy relevant data.

Third, the investigation in Mainland China was
carried out by a native Chinese-speaking member of
the team. His background knowledge was product
design and usability, and he was deeply involved in
the preparation of the study.

Without these framework conditions, it is not possible
to get real cultural impact data from a user target
culture. Culture-oriented design will continue to be a
challenge. The sooner software engineers start to
integrate cultural diversity of users into international
product development, the sooner products become
more fit for a global market.
KEY TERMS

CNC Machine: Machine tool with computer
numerical control; a standard in the mechanical
engineering field.

Culture-Oriented Design: Specific kind of
user-oriented design. Also focused on the user as a
central element of development, but taking into
account the cultural diversity of different target user
groups.

INTOPS-1: Requirements of the non-European
market on machine design. This project was funded
by the German ministry of education and research
(1996-1998).

INTOPS-2: Requirements of the user in Mainland
China for Human Machine Systems in the area
of production automation. This project was funded
by several companies from Germany and Switzerland
(2000-2001).



Product Localization: Optimization of a product
for a specific target culture; less commonly, the
development of a product solely for this
specific target culture.

Target User Group: Refers to a focus user
group, which a product or a development process
specifically targets or aims at. The qualities of such
a group are a relevant baseline for the developer
(used as orientation and target for the development
process and the features of the newly designed
product). Their requirements are relevant for the
orientation of the developer.

User-Oriented Design: Development approach 
with a focus on user requirements and user needs as 
a basis for a system or product development. 
INTRODUCTION

Computer-mediated communication between hu- 
mans is becoming ubiquitous. Computers are in- 
creasingly connected via high-speed local and wide- 
area networks, and via wireless technologies. High 
bandwidth interaction is increasing communication 
speed, offering the possibility for transmission of 
images, voice, sound, video and formatted data as 
well as text. Computer technologies are creating the 
possibility of entirely new interfaces of human- 
machine interaction, and entirely new virtual “spaces” 
for human-human interaction. As a collectivity, these 
new spaces of communication are known as 
cyberspace. 

Human-human interaction is the foundation of 
culture. Vygotsky and Luria’s (1994) model of cul- 
tural development highlights the need to consider the 
culture(s) of cyberspace (“cyberculture(s)”) in any 
examination of computer-mediated human commu- 
nications, because it invokes both the communica- 
tive and behavioural practices that humans employ 
as they interact with their environment. 

BACKGROUND 

Vygotsky and Luria (1994) propose that human 
beings use multiple psychological structures to me- 
diate between themselves and their surroundings. 
Structures classified as signs include linguistic and 
non-linguistic mechanisms of communication; struc- 
tures classified as tools encompass a wide range of 
other behavioural patterns and procedures that an 
individual learns and adopts in order to function 
effectively within a culture or society. Together, 
signs and tools allow individuals to process and 
interpret information, construct meaning and inter- 
act with the objects, people and situations they 
regularly encounter. When these elaborate mediating
structures, finely honed to navigate a specific
environment, encounter a different one, they can
malfunction or break down completely.

In the context of the Internet, human beings do 
not simply interact with digital interfaces. Rather, 
they bring with them into cyberspace a range of 
communicative and behavioural cultural practices 
that impact their ability to interact with technology 
interfaces, with the culture of the virtual spaces they 
enter, and with other humans they encounter there. 
Their individual and group cultural practices may or 
may not “match” the practices of the virtual culture(s) 
of cyberspace. Some investigators have gone as far 
as to suggest that the sociocultural aspects of com- 
puter-mediated human interaction are even more 
significant than technical considerations of the inter- 
face in the successful construction and sharing of 
meaning. This article surveys current theories of the 
nature and construction of cyberculture(s), and of- 
fers some brief thoughts on the future importance of 
cyberculture studies to the field of HCI. 

KEY DEBATES IN 
CYBERCULTURE STUDIES 

Perhaps the most striking feature of the body of 
current literature on cyberculture is the polarization 
of debate on almost every issue. A few authors 
examine these emerging paradoxes directly. Fisher 
and Wright (2001) and Poster (2001) explicitly compare
and contrast the co-existing utopian and
dystopian predictions in discourse surrounding the 
Internet. Levy (2001a), Poster (2000), and Jordan 
(1999) go as far as to suggest that the very nature of 
the Internet itself is paradoxical, being universalizing 
but non-totalizing, liberating and dominating, em- 
powering and fragmenting, constant only in its 
changeability. Most writers thus far have tended, 
however, to theorize for one side or the other within 
polarized debates, as will become evident next.

Utopia or Dystopia?

While not explicitly espousing technological instru- 
mentalism (an assumption that technology is “cul- 
ture neutral”), a number of writers offer utopian 
visions for the so-called Information Superhighway. 
Such theorists predict that the emancipatory poten- 
tial of Internet communications will help to bring 
about new forms of democracy and new synergies 
of collective intelligence within the Global Village of 
cyberspace (Ess, 1998; Levy, 2001a, 2001b; Morse, 
1997). 

Their detractors argue that these writers ignore 
the reality that culture and cultural values are inex- 
tricably linked to both the medium and to language 
(Anderson, 1995; Benson & Standing, 2000; Bijker 
& Law, 1992; Chase, Macfadyen, Reeder, & Roche, 
2002; Gibbs & Krause, 2000; Pargman, 1998; Wil- 
son, Qayyum, & Boshier, 1998) and that cybercul- 
ture “originates in a well-known social and cultural 
matrix” (Escobar, 1994, p. 214). These theorists 
more commonly offer dystopian and technologically 
deterministic visions of cyberspace, where money- 
oriented entrepreneurial culture dominates (Castells, 
2001), which reflects and extends existing hierar- 
chies of social and economic inequality (Castells, 
2001; Escobar, 1994; Jordan, 1999; Keniston & Hall,
1998; Kolko, Nakamura, & Rodman, 2000; Luke, 
1997; Wilson et al., 1998), and which promotes and 
privileges American/Western cultural values and 
the valorization of technological skills (Anderson, 
1995; Castells, 2001; Howe, 1998; Keniston & Hall,
1998; Luke, 1997; Wilson et al., 1998). 

These and other thematically polarized arguments
about cyberculture (such as “Internet as locus
of corporate control” versus “Internet as new social
space” (Levy, 2001a) or “Internet as cultural
context” versus “Internet as a cultural artifact”
(Mactaggart, 2001)) are evident in the philosophical
arguments underlying work listed in other sections of
this article.

Modern or Postmodern? 

A second major division in theoretical discussions of 
the nature and culture of cyberspace is the
question of whether the Internet (and its associated
technologies) is a modern or postmodern phenomenon.
Numerous writers frame the development of
Internet technologies, and the new communicative 
space made possible by them, as simply the contem- 
porary technical manifestation of “modern ideals, 
firmly situated in the revolutionary and republican 
ideals of liberty, equality and fraternity” (Levy, 
2001a, p. 230). Emphasizing the coherence of cur- 
rent technologies with ongoing cultural evolution(s), 
Escobar (1994) discusses the Western cultural foun- 
dations of technological development, and Gunkel 
and Gunkel (1997) theorize that the logic of 
cyberspace is simply an expansion of colonial Euro- 
pean expansionism. Castells (2001) sees cybercul- 
ture as emerging from an existing culture of scien- 
tific and technological excellence “enlisted on a 
mission of world domination” (p. 60). Orvell (1998)
pointedly argues that “debates about postmodernity 
have evinced a kind of amnesia about the past” (p. 
13) and claims that cyberspace and virtual reality 
technologies are continuous with the Romantic imagi- 
nation as it developed in the 1830s and 1840s. 
Disembodiment, he argues, is not a new product of 
the modern age, but was the “triumph of the Roman- 
tic imagination” (p. 16). 

More recently, other writers have begun to envi- 
sion the cultural sphere of cyberspace as radically 
new, postmodern, and signifying a drastic break with 
cultural patterns of community, identity and
communication. For example, Webb (1998) suggests that
the frontier metaphors of cyberspace symbolize a 
postmodern shift from human/territorialized to non- 
human/deterritorialized computer-mediated environ- 
ments. Poster (2000) claims that Internet technolo- 
gies have actually brought into being a “second order 
of culture, one apart from the synchronous exchange 
of symbols and sounds between people in territorial 
space” (p. 13). He predicts that the cultural conse- 
quences of this innovation must be “devastation for 
the modern” (p. 13), and (2001) reformulates for this 
context the propositions of postmodern theorists 
such as Foucault, Heidegger, Deleuze, Baudrillard
and Derrida who challenge modernist notions of
progress, definable “authentic” selfhood and the
existence of absolute foundations for, or structures
of, knowledge (for a short review of postmodern
thought see Schutz, 2000). Poster argues effectively 
that postmodern perspectives on life and culture that 
go beyond old notions of fixed social structures may
be most relevant for the fluid, dynamic, and even
contradictory cultured environment of cyberspace. 

Cybercultural Values 

Are the values of cyberculture simply the imported 
values of existing non-virtual cultures? Or does cy- 
berculture represent a newly evolving cultural mi- 
lieu? Various authors speculate about the nature and 
origins of cybercultural values. Anderson (1995) 
argues that cyberculture values are “speed, reach, 
openness, quick response” (p. 13). Castells (2001) 
believes that “hacker culture” is foundational to 
cyberculture, and carries with it meritocratic values, 
an early notion of cybercommunity, and a high valu- 
ing of individual freedom. Jordan (1999) contends 
that it is power which structures culture, politics and 
economics, and theorizes the existence of 
“technopower” as the power of the Internet elite that 
shapes the normative order of cyberculture. Later 
(Jordan, 2001), and similarly to Castells, he elabo- 
rates on the Anglo-American language and culture 
bias of cyberspace, which he argues is founded on 
competition and informational forms of libertarian 
and anarchist ideologies. Knupfer (1997) and Morse 
(1997) explore the gendered (masculine) nature of 
cyberculture, while Kolko et al. (2000) theorize that 
(the predominantly Anglo-American) participants in 
new virtual communities bring with them the (pre- 
dominantly American) values of their home cultures. 
As a result, and as Star (1995) had already pointed 
out, “there is no guarantee that interaction over the 
net will not simply replicate the inequities of gender, 
race and class we know in other forms of communi- 
cation” (p. 8). Essays in Shields’ (1996) edited col- 
lection examine features of cybercultural values and 
practice such as attitudes to censorship, social inter- 
action, politics of domination and gendered practices 
of networking. 

Others, however, speculate that cyberspace is the 
site of creation of an entirely new culture. Levy 
(2001a) argues, for example, that cyberculture ex- 
presses the rise of “a new universal, different from 
the cultural forms that preceded it, because it is 
constructed from the indeterminateness of global 
meaning” (p. 100), and Healy (1997) characterizes 
cyberspace as a middle landscape between civiliza- 
tion and wilderness, where new cultural directions 
and choices can be selected. 



Subcultures of/in Cyberspace 

Are subcultures identifiable within the cultural mi- 
lieu of cyberspace? A number of theorists discuss 
the online cultures of specific subgroups: Castells 
(2001) discusses the “hacker culture” in detail, 
Leonardi (2002) investigates the online culture and 
website designs of U.S. Hispanics, and Gibbs and 
Krause (2000) explore the metaphors used by dif- 
ferent Internet subcultures (hackers, cyberpunks). 

Rather than simply itemizing and describing 
cyberspace subcultures, however, a growing num- 
ber of studies are exploring the marginalization in or 
lack of access to cyberspace of some cultural 
groups. Howe (1998) argues, for example, that 
radical differences in cultural values make 
cyberspace inhospitable for Native Americans. 
Keniston and Hall (1998) offer statistics on the 
English language and Western dominance of the 
Internet; they discuss the reality that 95% of the 
population of the world’s largest democracy — In- 
dia — are excluded from computer use because they 
lack English language fluency. Anderson (1995) 
suggests that the “liberal humanist traditions of 
Islamic and Arab high culture” (p. 15) are absent 
from the world of cyberspace, not because they do 
not translate to new media, but because they are 
literally drowned out by the cultural values attached 
to the dominant language and culture of cyberspace 
as it is currently configured. Similarly, Ferns (1996), 
Morse (1997) and Knupfer (1997) suggest that the 
gendered culture of cyberspace has tended to ex- 
clude women from this virtual world. Interestingly, 
Dahan (2003) reports on the limited online public 
sphere available to Palestinian Israelis relative to 
the Jewish majority population in Israel. Rather 
than being the result of Anglo-American cultural 
domination of the Internet, this author convincingly 
argues that this imbalance demonstrates that “existing
political and social disenfranchisement” (¶ 1)
in Israeli society is simply mirrored in cyberspace. 
More generally, Davis (2000) reports on the potential
for groups to be disenfranchised from the “global
technoculture” depending on their different
manifestations of “a diversity of human affinities and
values” (p. 105). Stald and Tufte (2002) meanwhile
present a series of reports on cyberspace activities 
of a number of discrete and identifiable minority 
communities: rural black African males in a South
African university, South Asian families in London,
women in India, Iranian immigrants in London, and 
young immigrant Danes. 

In Search of Utopia: Cultural Impact 
and Technology Design 

If we accept cyberculture as a value system that 
embodies free speech, individual control, and a 
breaking down of the barrier of distance, does this 
imply that existing (or perceived) inequalities can be 
corrected in pursuit of this utopia? Keniston and Hall 
(1998) attempt to detail issues that must be faced in 
such efforts: attention to nationalist reactions to 
English-speaking elites, development of standard- 
ized forms for vernacular languages, and the “real 
(and imagined)” (p. 331) challenges faced by North 
American software firms when dealing with South 
Asian languages. Benson and Standing (2000) con- 
sider the role that policy plays in the preservation of 
cultural values, and present a new evaluation frame- 
work for assessing the impact of technology and 
communication infrastructure on culture. Wilson et 
al. (1998) propose that a framework based on 
Chomsky’s 1989 analysis of mass media can help 
determine the extent to which American corpora- 
tions and institutions dominate (and thus determine 
the culture of) cyberspace. 

Of particular relevance to the field of HCI may 
be some preliminary efforts to tailor online environ- 
ments to particular cultural groups. For example, 
Turk (2000) summarizes recent attempts to establish 
relationships between the culture of users and their 
preferences for particular user interfaces and WWW 
site designs, while Leonardi (2002) reports on a 
study of the manifestation of “Hispanic cultural 
qualities” (p. 297) (cultural qualities perceived within 
the US-American context to derive from “Spanish” 
cultures) in Web site design, and makes design 
recommendations for this community. Heaton (1998) 
draws on Bijker and Law’s (1992) notion of “tech- 
nological frame” to explain how Japanese designers 
invoke elements of Japanese culture in justifying 
technical decisions. The theoretical and method- 
ological challenges of this approach to technology 
design are explored in greater detail in “Internet- 
Mediated Communication at the Cultural Interface” 
(Macfadyen, 2006), contained in this encyclopedia. 



FUTURE TRENDS 

An ongoing tension is apparent in the existing cyber- 
culture literature between theories that assume “im- 
portation of pre-existing cultures” and theories that 
anticipate new “cultural construction” in the emerg- 
ing communicative spaces of cyberspace. In the 
former, theorists have tended to identify and charac- 
terize or categorize groups of communicators in 
cyberspace using rather deterministic or essentialist 
(usually ethnic) cultural definitions, and have then 
theorized about the ways in which such groups 
import and impose, or lose, their cultural practices in 
the cyberspace milieu. Sociopolitical analyses have 
then built on such classifications by positioning 
cyberspace communicators as constrained or privi- 
leged by the dominant cyberculture. While problem- 
atic, application of such static definitions of culture 
has allowed some preliminary forays into develop- 
ment of so-called culturally appropriate “versions” 
of human-computer interfaces and online environ- 
ments. 

A continuing challenge, however, is perceived to 
be the lack of an adequate theory of culture that 
would allow analysis of the real complexities of 
virtual cultures and virtual communities (Ess, 1998), 
and that could guide better technology and interface 
design. Recently, however, a few theorists have 
begun to question the utility of static and “classifica- 
tory” models or definitions of culture. Abdelnour- 
Nocera (2002) instead argues that examination of 
“cultural construction from inside the Net” (¶ 1) is
critical. Benson and Standing (2000) offer an en- 
tirely new systems theory of culture that emphasizes 
culture as an indivisible system rather than as a set 
of categories. Importantly, challenging postmodern 
theorists such as Poster (2001) argue that the Internet
demands a social and cultural theory all its own. 
Common to these theories is an underscoring of the 
dynamic nature of culture, and of the role of individu- 
als as active agents in the construction of culture — 
online or off-line. 



CONCLUSION 

Whenever humans interact with each other over 
time, new cultures come into being. In cyberspace,
networked computer technologies facilitate and shape
(or impede and block) the processes of cultural 
construction. Although debates over the nature of 
cyberculture continue to rage, one point is increas- 
ingly clear: in the field of human-computer interac- 
tion, it is no longer sufficient to focus solely on the 
interface between individual humans and machines. 
Any effort to examine networked human communi- 
cations must take into consideration human interac- 
tion with and within the cultures of cyberspace that 
computer technologies bring into being.

KEY TERMS

Culture: Multiple definitions exist, including
essentialist models that focus on shared patterns of
learned values, beliefs, and behaviours, and social
constructivist views that emphasize culture as a
shared system of problem-solving or of making
collective meaning. The key to the understanding of
online cultures, where communication is as yet
dominated by text, may be definitions of culture
that emphasize the intimate and reciprocal
relationship between culture and language.

Cyberculture: As a social space in which human
beings interact and communicate, cyberspace
can be assumed to possess an evolving culture or set
of cultures (“cybercultures”) that may encompass
beliefs, practices, attitudes, modes of thought,
behaviours and values.

Cyberspace: While the “Internet” refers more
explicitly to the technological infrastructure of
networked computers that make worldwide digital
communications possible, “cyberspace” is understood as
the virtual “places” in which human beings can
communicate with each other, and that are made
possible by Internet technologies. Levy (2001a)
characterizes cyberspace as “not only the material
infrastructure of digital communications but... the
oceanic universe of information it holds, as well as
the human beings who navigate and nourish that
infrastructure.”

Dystopia: The converse of Utopia, a Dystopia is
any society considered to be undesirable. It is often
used to refer to a fictional (often near-future)
society where current social trends are taken to terrible
and socially-destructive extremes.

Modern: In the social sciences, “modern” refers
to the political, cultural, and economic forms
(and their philosophical and social underpinnings)
that characterize contemporary Western and,
arguably, industrialized society. In particular, modernist
cultural theories have sought to develop rational and
universal theories that can describe and explain
human societies.

Postmodern: Theoretical approaches characterized
as postmodern, conversely, have abandoned
the belief that rational and universal social theories 
are desirable or exist. Postmodern theories also 
challenge foundational modernist assumptions such 
as “the idea of progress,” or “freedom.” 

Technological Determinism: The belief that 
technology develops according to its own “internal” 
laws and must therefore be regarded as an autono- 
mous system controlling, permeating, and condition- 
ing all areas of society. 

Technological Instrumentalism: The view that 
technologies are merely useful and “culture-neu- 
tral” instruments, and that they carry no cultural 
values or assumptions in their design or implementa- 
tion. 

Utopia: A real or imagined society, place, or 
state that is considered to be perfect or ideal.

INTRODUCTION

Design frameworks are a phenomenon appearing in
the field of new media (e.g., Brook & Oliver, 2003; 
Fiore, 2003; Dix, Rodden, Davies, Trevor, Friday, & 
Palfreyman, 2000; Taylor, Sumner, & Law, 1997). 
They appear to be a response to the multi-disciplin- 
ary nature of the field and have a number of things 
in common. They are usually developed in response 
to a perceived lack of common understanding or 
shared reference. Frameworks often advocate a set 
of principles, a particular ethos, or expound a philo- 
sophical position, within which a collection of meth- 
ods, approaches, tools, or patterns are framed. They 
aim to support design analysis, decision-making and 
guide activity, and provide a common vocabulary for 
multi-disciplinary teams. In contrast to some design 
methods and models, they tend to be broad and 
encompass a wider area of application. Rather than 
prescribe a single “correct” way of doing something, 
they provide a guiding structure that can be used 
flexibly to support a range of activity. This article 
describes one design framework, the experience 
design framework (Jefsioutine & Knight, 2004) to 
illustrate the concept. 

BACKGROUND 

The experience design framework (EDF) illustrates 
a number of the features of design frameworks 
identified previously. It was developed in response 
to the low take-up of user-centred design observed 
by the authors and identified in the literature (e.g., 
Landauer, 1996; Nielsen, 1994). For example, Sade 
(2000, p. 21) points out that some of the large-scale 
user-centred design (UCD) methods “do not suit the 
varied and fast paced consulting projects of a design
firm.” Nielsen suggests that one of the key reasons
why usability engineering is not used in practice is 
the perceived cost. He argues that a “discount 
usability engineering” approach can be highly effec- 
tive and describes a set of “simpler usability meth- 
ods” (Nielsen, 1994, pp. 246-247). Eason and Harker 
(1988) found that, as well as perceived cost and 
duration, user-centred methods were not used be- 
cause designers felt that useful information was 
either not available when needed or was not relevant 
and that methods did not fit in with their design 
philosophy. 

The authors thus set about identifying a set of 
user-centred methods that would be cost effective, 
flexible enough to apply to any design life cycle and, 
most importantly, would be useful and relevant to the 
needs of the designer. Through a combination of 
literature reviews and application to practice, the 
authors identified different aspects of designing a 
user experience and the way in which these aspects 
can be drawn together to focus design research and 
practice. The EDF is thus based on the principles of 
user-centred design and represents a way of using a 
range of methods to achieve a set of qualities that 
work at all dimensions of experience. 

USER-CENTRED DESIGN 
PRINCIPLES (UCD) 

Human-centred design processes for interactive 
systems identifies the following characteristics of a 
user-centred design process: “The active involve- 
ment of users and a clear understanding of user and 
task requirements; An appropriate allocation of func- 
tion between users and technology; The iteration of 
design solutions; Multidisciplinary design”
(International Organization for Standardization, ISO/IEC
13407, 1999). Additionally, Gould and Lewis (1985)
emphasise the importance of early and continual 
user testing and integrating all aspects of usability. 

These principles of UCD set out a clear approach 
around which to plan a design life cycle, but they 
focus very much on design for usability. The EDF 
proposes that the same principles be applied to other 
qualities of design. 

Qualities, Dimensions and Effectors of 
an Experience 

It was felt that one of the reasons UCD methods 
were seen as irrelevant and limited was that the 
traditional focus on usability does not capture other 
aspects of the user-experience. The EDF identifies 
a broader set of qualities that address the less 
tangible aspects of an experience, such as pleasure 
and engagement. It then identifies the different
dimensions of experiencing (visceral, behavioural,
reflective, and social; from Jordan, 2000; Norman,
2003) that need to be addressed to design a holistic
user experience. It identifies a number of aspects 
that have an effect on an experience, such as who, 
why, what, where, when, and how, that help to guide 
research, design, and evaluation. 

METHODS AND TOOLS 

Product design, HCI, and human factors research 
are awash with methods and tools that can be used 
to support user-centred design. Generally, tools have 
focused on technological aspects of design, either in 
terms of making coding easier or automating aspects 
of design. Where tools have related to usability, this 
has often focused on evaluation. A less developed
area is tools that support the understanding of the
user at early stages of design and that support the
entire user-centred design process (some rare
examples are HISER, 1994; NIST’s WebCAT, 1998).

Jordan (2000) describes a collection of empirical 
and non-empirical methods suitable for the “new 
human factors approach” to designing pleasurable 
products. Rather than prescribing a process or a set 
of key methods or tools, the EDF suggests that a 
range of tools and techniques can be employed 
provided they cover four basic purposes of
observing/exploring, participation/empathy,
communicating/modelling, and testing/evaluation. Furthermore,
by applying these methods in the context of the EDF, 
a better understanding of the user experience as a 
whole can be achieved. 

Observation and Exploration 

These methods are about finding out and can be 
drawn from demography, ethnography, market re- 
search, psychology, and HCI (e.g., task analysis, 
field observation, interviews, questionnaires, focus 
groups, affinity diagramming, laddering, and experi- 
ence diaries). The EDF indicates the kind of infor- 
mation that should be sought, such as the range of 
user characteristics including personality, motiva- 
tions, social affiliations, physical or mental disabili- 
ties, and so forth. 

Communicating and Modelling 

These methods serve to communicate the research 
data, design requirements, and ideas to a multi- 
disciplinary team who may not have a common 
vocabulary (e.g., user profiles and personas, use 
cases or task scenarios, scenario-based design, mood 
boards, written briefs and specifications, 
storyboarding, and prototypes). Again, the EDF 
helps to focus the information that is communicated 
on issues pertinent to the whole user experience. 

Participation and Empathy 

These methods represent an approach aimed at
gaining a deeper understanding of, and empathy for,
users and socio-political and quality-of-life issues
(e.g., immersive methods such as ethnographic
participant-observation and the “eat your own dog
food” approach). Other methods such as participatory design
advocate designing with users rather than for them 
(see Schuler & Namioka, 1993). 

Testing and Evaluating 

Gould and Lewis (1985) recommend iterative design
based on empirical testing (e.g., usability testing
through controlled observation and measurement).
The EDF broadens the test and evaluative criteria
from the traditional focus on cognitive and behavioural
measures, like the time taken to complete a task or
the number of errors or deviations from a critical
path, to include methods such as transcript analysis,
attitude measurement, and emotional response.

Figure 1. The experience design framework

User-Lab has used the EDF to adapt and focus 
methods for requirements research, brief develop- 
ment, ideation, and testing, and has developed a range 
of services and training based on this research. 

FUTURE TRENDS 

The growing number of design frameworks results 
from an increasingly multi-disciplinary field where 
any single approach is not sufficient. They are, 
perhaps, indicative of a rejection of prescriptive 
approaches to design. Competitive work environ- 
ments demand flexibility, and practitioners are at- 
tracted to tools and methods that can be adapted to 
their working practices. A design framework pro- 
vides a general approach for framing these methods 
and for disseminating multidimensional research in a 
way that is neither daunting nor demanding of extensive
study. It is likely that, as the boundaries between 
disciplines blur, the number of design frameworks 
will continue to grow. 

CONCLUSION 

The EDF is an example of a design framework. It 
was developed to provide a flexible set of methods
and tools to guide and support the design process
within a principled context, without being prescrip- 
tive or restricting, and as such can be used within 
any design life cycle, and at any stage. It provides 
a common vocabulary to a multi-disciplinary team 
and also serves to direct research and the develop- 
ment of new methods and tools. Although designed 
primarily for digital media product design, it is broad 
enough to be applied to other areas of product 
design. It is illustrative of what seems to be a 
growing trend in multi-disciplinary approaches to 
design and in bridging the gaps between research 
and practice.

KEY TERMS

Design Framework: An open-ended design 
methodology that combines research and design 
activity. 

Design Methods: Methods, tools, and tech- 
niques employed during research, design, and devel- 
opment. 

Design Research: Exploratory activities em- 
ployed to understand the product, process of design, 
distribution and consumption, and stakeholders’ val- 
ues and influence. 

Discount Usability Methods: A focused set of 
design and evaluation tools and methods aimed at 
improving usability with the minimum resources. 

Human-Centred Design: An alternative name 
for user-centred design (UCD) used in ISO process 
standards. 

Multidisciplinary Design: A collaborative ap- 
proach to design that shares research and design 
activities among a range of disciplines. 

Product Design: An overall term that covers 
the study and execution of design pertaining to 
physical products.

INTRODUCTION

User-centred development (Norman & Draper, 
1986; Vredenburg, Isensee, & Righi, 2001) pro- 
cesses advocate the use of participatory design 
activities, end-user evaluations, and brainstorming in 
the early phases of development. Such approaches 
work in opposition to some software-engineering
techniques that promote iterative development
processes, such as agile processes (Beck, 1999), in
order to produce software as quickly and as cheaply 
as possible. 

One way of justifying the profitability of development
processes promoted in the field of human-computer
interaction (HCI) is to take into account not only
development costs but also costs of use, that is,
costs related to employment, training, and usage
errors. Gains in performance (for instance, by
providing default values in the various fields of a
computer form) or in reducing the impact of errors
(by providing undo facilities, for instance) can only
be evaluated if the actual use of the system is
integrated in the computation of the development
costs.

These considerations are represented in Figure 
1. The upper bar of Figure 1 shows that development 
costs (grey part and black part) are higher than the 
development costs of RAD (rapid application devel- 
opment), represented in the lower bar (grey part). 
The black part of the upper bar shows the additional 
costs directly attributed to user-centred design.
User-centred development processes compensate for
these additional costs by offering additional payoffs
when the system is actually deployed and used.

Figure 1. Comparing the cost of development processes

The precise evaluation of costs and payoffs for 
usability engineering can be found in Mayhew and 
Bias (1994). 
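
The argument above can be illustrated with a short calculation. The following Python sketch is only an illustration under stated assumptions: the function name and all figures are hypothetical, chosen for the example, and are not taken from this article or from Mayhew and Bias (1994). It compares two development processes by total cost, that is, development cost plus cost of use.

```python
def total_cost(development, ucd_overhead, cost_of_use):
    """Total cost of a system: development cost, any extra cost of
    user-centred design, and the cost of use after deployment
    (employment, training, usage errors)."""
    return development + ucd_overhead + cost_of_use


# Hypothetical figures in arbitrary cost units (assumptions, not data).
rad = total_cost(development=100, ucd_overhead=0, cost_of_use=80)
ucd = total_cost(development=100, ucd_overhead=30, cost_of_use=35)

print(f"RAD total cost: {rad}")
print(f"UCD total cost: {ucd}")
print("UCD pays off" if ucd < rad else "UCD does not pay off")
```

With these invented figures the user-centred process costs more to develop but less overall, which is the shape of the argument made with Figure 1; with other figures the comparison could go the other way.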

Design-rationale approaches (Buckingham Shum, 
1996) face the same problems of profitability as 
user-centred development processes. As payoffs
are not immediately identifiable, developers and
designers of software products are still reluctant
either to try design rationale or to use it in a
systematic way.

Design rationale pursues three main goals.

1. Provide means (notations, tools, techniques, 
etc.) for the systematic exploration of design 
alternatives throughout the development process

2. Provide means to support argumentation when
design choices are to be made 

3. Provide means to keep track of these design
choices in order to be able to justify them once
they have been made

Such approaches increase the production of ra- 
tional designs, that is, where trust in designers’ 
capabilities can be traced back. One of the main 
arguments for following a rationale-based-design 
development process is that such processes in- 
crease the overall quality of systems. However, 
when it comes to putting design rationale into prac- 
tice, that is, within development teams and real 
projects, more concrete arguments around costs and 
benefits have to be provided. 

Figure 2 reuses the same argumentation process 
as the one used in Figure 1 for justifying the profit- 
ability of user-centred approaches. While user- 
centred approaches find their profitability when 
costs related to the actual use of the system are 
taken into account, design rationale finds its
profitability when costs are taken into account across
several projects. Figure 2 is made up of three bars,
each representing a different project. The grey parts 
of the bars represent the development cost for the 
project. The black parts represent the additional 
costs for using a development process following a 
design-rationale approach. As shown, the lengths of
the black parts of the bars remain the same,
representing the fact that costs related to
design-rationale activities remain the same across
projects. According to the projects we have been
working on, this is clearly not true for the first
project in a given domain.



Figure 2. Profitability related to design rationale

Indeed, the basic elements of design rationale have
to be gathered first, such as the pertinent criteria and
factors according to the domain and the
characteristics of the project. The other interesting
aspect of Figure 2 is that the development cost of a
project decreases with the number of projects, as
reuse from design rationale increases accordingly.
The white parts of the bars represent the increasing
savings due to the reuse, through the design-rationale
approach, of information from previous projects.
These savings are likely to follow a logarithmic
curve, that is, to level off after a certain number of
projects as each further cost decrease becomes
smaller. However, our experience of design-rationale
approaches is not wide enough to give more precise
information about this.
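
The cost argument around Figure 2 can be illustrated with a small sketch. Every number and name below (`base_cost`, `rationale_cost`, `first_project_extra`, `reuse_scale`) is a hypothetical assumption chosen only to reproduce the shape described in the text: a fixed design-rationale overhead, a higher overhead for the first project in a domain, and savings from reuse that grow logarithmically. It is not a model taken from the authors' data.

```python
import math


def project_costs(n_projects, base_cost=100.0, rationale_cost=15.0,
                  first_project_extra=10.0, reuse_scale=12.0):
    """Return one (project, development, rationale, savings, total) row per project.

    All parameters are illustrative assumptions, not measured values.
    """
    rows = []
    for k in range(1, n_projects + 1):
        # The first project in a domain pays extra to gather criteria and factors.
        rationale = rationale_cost + (first_project_extra if k == 1 else 0.0)
        # Savings from reuse grow logarithmically; log(1) == 0 for the first project.
        savings = reuse_scale * math.log(k)
        development = base_cost - savings
        rows.append((k, development, rationale, savings, development + rationale))
    return rows


if __name__ == "__main__":
    for k, dev, rat, sav, total in project_costs(5):
        print(f"project {k}: development={dev:.1f} rationale={rat:.1f} "
              f"savings={sav:.1f} total={total:.1f}")
```

Under these assumptions the total cost falls from project to project, but each additional project saves less than the previous one, which is the levelling-off behaviour the text anticipates.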

Development processes in the field of safety- 
critical systems (such as RTCA/DO-178B, 1992) 
explicitly require the use of methods and techniques 
for systematically exploring design options and for 
increasing the traceability of design decisions. DO- 
178B is a document describing a design process. 
However, even though such development processes 
are widely used in the aeronautical domain, the 
design-rationale part remains superficially addressed. 

We believe that this underexploitation of such a
critical aspect of the design process has two main
causes.

• There is no integration of current practice in 
user-centred design processes and design ra- 
tionale. For instance, no design-rationale nota- 
tion or tool relates to task modeling, scenarios, 
dialogue models, usability heuristics, and so 
forth that are at the core of the discipline. 

• There is no adequate tool to support a demanding
activity such as design rationale that is heavily
based on information storage and retrieval as
well as on reuse. In software engineering, similar
activities are supported by CASE tools that are
recognised as critical elements for the effective
use of notations.

The next section presents a set of design-rationale
notations and a tool, based on the QOC (questions,
options, criteria) notation (MacLean, Young,
Bellotti, & Moran, 1991), that is dedicated to the
rational design of interactive systems.

BACKGROUND



Figure 3. Schematic view of a QOC diagram 



In this section, we briefly describe existing
design-rationale notations, and then we detail the
QOC notation.

IBIS (issue-based information system; Kunz &
Rittel, 1970) was designed to capture relevant
information at low cost and to support fast
information retrieval. IBIS has some scalability
problems due to the nonhierarchical organisation of
the diagram. DRL (decision representation language)
was conceived by Lee (1991). The goal was to provide
a notation for tracing the process that leads the
designers to choose an alternative. DRL, based on a
strict vocabulary, captures more information than
necessary for design rationale, and its diagrams
quickly become incomprehensible.

QOC is a semiformal notation (see Figure 3) 
introduced by MacLean et al. (1996) that has two 
main advantages: It is easy to understand but still 
useful in terms of structuring. QOC was designed for 
reuse. 

The easy-to-understand characteristic is critical 
for design rationale as the models built using the 
notation must be understandable by the various ac- 
tors involved in the development process (i.e., de- 
signers, software engineers, human-factors experts, 
and so forth). A QOC diagram is structured in three 
columns, one for each element (questions, options, 
criteria), and features links between columns’ ele- 
ments. For each question that may occur during the 
development process, the actor may relate one or 
more relevant options (i.e., candidate design solu- 
tions), and to these options, criteria are related (by 
means of a line) in order to represent the fact that a 
given option has an impact (if beneficial, the line is 
thick; if not, the line is dashed). In QOC, an option 
may lead to another question (as, for instance, Ques- 
tion 2 in Figure 3), thus explicitly showing links 
between diagrams. In addition, arguments can be 
attached to options in order to describe further detail: 
either the content or the underlying rationale for 
representing the option. 
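
The structure of a QOC diagram described above can be sketched as a small data model. The class names, the `leads_to` field, and especially the `score` rule (one point per supporting link, minus one per opposing link) are assumptions made for illustration; the QOC notation itself does not prescribe how options are ranked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    name: str


@dataclass
class Assessment:
    """A link between an option and a criterion."""
    criterion: Criterion
    supports: bool  # True: beneficial (thick line); False: not beneficial (dashed line)


@dataclass
class Option:
    name: str
    assessments: List[Assessment] = field(default_factory=list)
    arguments: List[str] = field(default_factory=list)  # free-text arguments
    leads_to: Optional["Question"] = None  # an option may raise a further question

    def score(self) -> int:
        # Hypothetical ranking rule: +1 per supporting link, -1 per opposing link.
        return sum(1 if a.supports else -1 for a in self.assessments)


@dataclass
class Question:
    text: str
    options: List[Option] = field(default_factory=list)

    def best_option(self) -> Option:
        return max(self.options, key=Option.score)


fast = Criterion("Fast to operate")
learn = Criterion("Easy to learn")
menu = Option("Pull-down menu", [Assessment(learn, True), Assessment(fast, False)])
shortcut = Option("Keyboard shortcut",
                  [Assessment(fast, True), Assessment(learn, False)],
                  arguments=["Expert users prefer the keyboard"])
toolbar = Option("Toolbar button", [Assessment(fast, True), Assessment(learn, True)])
toolbar.leads_to = Question("Where should the toolbar be placed?")
question = Question("How should the command be invoked?", [menu, shortcut, toolbar])
```

In this made-up example, `question.best_option()` returns the toolbar option, and the `leads_to` link mirrors the way an option in one diagram can open a question in another.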



HCI-RELATED EXTENSIONS

In this section, we detail the extensions that we
propose. These extensions integrate HCI principles
into the notation.

Adding Task Models to QOC Diagrams

This extension (Lacaze, Palanque, & Navarre, 2002)
aims at integrating task models into QOC diagrams
so that the respective performance of the various
options under consideration can be assessed through
scenarios extracted from the task models
(Lacaze, Palanque, Navarre, & Bastide, 2002).

This extension is critical for an efficient,
rationalised development process, as it takes into
account task analysis and modeling as well as
scenarios, which are important and expensive
activities in user-centred design.

Adding Factors to QOC Diagrams

This extension was introduced by Farenc and
Palanque (1999). In the original QOC, there is no way
to store user requirements and thus to argue with
respect to them. However, it is clear that in
user-centred development, users play an important
role in the decisions made. In this extension, user
requirements are expressed as a set of factors. The
factors correspond to high-level requirements such
as learnability, safety, and so forth, and the
satisfaction of those factors can be checked against
their corresponding criteria. The early identification
of factors has been based on McCall, Richards, and
Walters’ (1977) classification, which is widely used in
software engineering. The elements of the
classification are the following.

• Quality factors: Requirements expressed by
the clients and/or users

• Quality criteria: Characteristics of the product
(technical point of view)

• Metrics: Measures that allow the actual
valuation of a criterion

Figure 4 shows some metrics; however, factors 
would be directly related to criteria. 
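
The relationship between factors, criteria, and metrics can be sketched as follows. The class names, the "at most a threshold" rule, and the observation keys are all hypothetical choices made for this illustration; they are not taken from the classification itself or from the DREAM tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Metric:
    name: str
    measure: Callable[[Dict[str, float]], float]  # computes a value from observations


@dataclass
class QualityCriterion:
    name: str
    metric: Metric
    threshold: float  # assumption: the criterion is met when the value is <= threshold

    def is_met(self, observations: Dict[str, float]) -> bool:
        return self.metric.measure(observations) <= self.threshold


@dataclass
class QualityFactor:
    name: str
    criteria: List[QualityCriterion]

    def is_satisfied(self, observations: Dict[str, float]) -> bool:
        # A factor is checked against all of its corresponding criteria.
        return all(c.is_met(observations) for c in self.criteria)


training = Metric("training time (min)", lambda obs: obs["training_minutes"])
errors = Metric("errors per task", lambda obs: obs["errors"] / obs["tasks"])
learnability = QualityFactor("learnability", [
    QualityCriterion("short training", training, threshold=30.0),
    QualityCriterion("few errors", errors, threshold=0.5),
])
observations = {"training_minutes": 20.0, "errors": 3.0, "tasks": 10.0}
```

With these invented observations, `learnability.is_satisfied(observations)` holds because both criteria stay under their thresholds; raising the training time above 30 minutes would make the factor fail.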

TOOL SUPPORT 

We have developed a CASE tool called DREAM
(Design Rationale Environment for Argumentation
and Modelling) that supports the previous extensions
as well as several others that are not presented here
due to space restrictions. A snapshot of this tool is
presented in Figure 5. The tool can be accessed at
http://liihs.irit.fr/dream/. DREAM offers several
visualisations of the same diagram. In the bottom
right-hand corner of Figure 5, the diagram is
displayed as a bifocal tree (Cava, Luzzardi,
& Freitas, 2002).

Figure 4. Scenario and task-model extensions to
QOC

Figure 5. Snapshot of DREAM tool

FUTURE TRENDS

Design rationale clearly has to be integrated into the
design process and merged with UML (Booch,
Rumbaugh, & Jacobson, 1999). This will improve
the quality of interactive systems (Newman &
Marshall, 1991). Interactive systems will be designed
and built rationally, and they will not depend
solely on the designers' beliefs.

CONCLUSION

The work presented here offers a plea for more
extensive use of design-rationale approaches for the
design of interactive systems. Notations closely
related to current practice in the field of HCI and
adequate tools are the ways to achieve this goal.




REFERENCES 

Beck, K. (1999). Extreme programming explained:
Embrace change. Boston: Addison-Wesley
Longman Publishing Co.

Booch, G., Rumbaugh, J., & Jacobson, I. (1999).
The unified modeling language user guide.
Boston: Addison-Wesley Longman Publishing Co.

Buckingham Shum, S. (1996). Analysing the
usability of a design rationale notation. In T. P.
Moran & J. Carroll (Eds.), Design rationale:
Concepts, techniques, and use (pp. 185-215).
Mahwah, NJ: Lawrence Erlbaum Associates.

Cava, R. A., Luzzardi, P. R. G., & Freitas, C. M. D.
S. (2002). The bifocal tree: A technique for the
visualization of hierarchical information structures.
IHC 2002: 5th Workshop on Human Factors in
Computer Systems (pp. 303-313).






Design Rationale for Increasing Profitability of Interactive Systems Development 



Farenc, C., & Palanque, P. (1999). Exploiting design
rationale notations for a rationalised design of
interactive applications. IHM’99: 11th
French-Speaking Conference on Human-Computer
Interaction (pp. 22-26).

Freitas, C., Cava, R., Winckler, M., & Palanque, P. 
(2003). Synergistic use of visualisation technique 
and Web navigation model for information space 
exploration. Proceedings of Human-Computer 
Interaction International (pp. 1091-1095). 

Gruber, T. R., & Russell, D. M. (1990). Design 
knowledge and design rationale: A framework 
for representation, capture, and use (Tech. Rep. 
No. KSL 90-45). Stanford University, Knowledge 
Systems Laboratory, California. 

Kunz, W., & Rittel, H. (1970). Issues as elements of 
information systems. Berkeley: University of Cali- 
fornia. 

Lacaze, X., Palanque, P., & Navarre, D. (2002).
Évaluation de performance et modèles de tâches
comme support à la conception rationnelle des
systèmes interactifs. IHM’02: 14th
French-Speaking Conference on Human-Computer
Interaction (pp. 17-24).

Lacaze, X., Palanque, P., Navarre, D., & Bastide,
R. (2002). Performance evaluation as a tool for
quantitative assessment of complexity of interactive
systems. In Lecture notes in computer science:
Vol. 2545. DSV-IS’02 9th Workshop on Design
Specification and Verification of Interactive
Systems (pp. 208-222). Springer.

Lee, J. (1991). Extending the Potts and Bruns model
for recording design rationale. Proceedings of the
13th International Conference on Software
Engineering (pp. 114-125).

MacLean, A., Young, R. M., Bellotti, V., & Moran, 
T. (1991a). Questions, options and criteria: Ele- 
ments of design space analysis. Journal on Human 
Computer Interaction, 6(3-4), 201-250. 

MacLean, A., Young, R. M., Bellotti, V. M. E., &
Moran, T. P. (1996). Questions, options, and criteria:
Elements of design space analysis. In T. P. Moran
& J. M. Carroll (Eds.), Design rationale: Concepts,
techniques, and use. Mahwah, NJ: Lawrence
Erlbaum Associates.

Martin, J. (1991). Rapid application development. 
UK: Macmillan Publishing Co. 

Mayhew, D., & Bias, R. (1994). Cost-justifying
usability. Academic Press.

McCall, J., Richards, P., & Walters, G. (1977). 
Factors in software quality (Vol. 3, RADC-TR- 
77-369). Rome Air Development Center (RADC). 

Newman, S. E., & Marshall, C. C. (1991). Pushing 
Toulmin too far: Learning from an argument 
representation scheme (Tech. Rep. No. SSL-92- 
45). Xerox PARC. 

Norman, D. A., & Draper, S. W. (1986). User- 
centred system design: New perspectives on hu- 
man computer interaction. Hillsdale, NJ: Lawrence 
Erlbaum Associates. 

Paterno, F. (2001). Task models in interactive soft- 
ware systems. In S. K. Chang (Ed.), Handbook of 
software engineering and knowledge engineer- 
ing (pp. 355-370). London: World Scientific Publish- 
ing Co. 

RTCA/DO-178B. (1992). Software considerations 
in airborne systems and equipment certification. 
RTCA Inc. 

Vredenburg, K., Isensee, S., & Righi, C. (2001).

KEY TERMS

Bifocal Tree: “The Bifocal Tree (Cava et al., 
2002) is a visual representation for displaying tree 
structures based on a node-edge diagram. The tech- 
nique displays a single tree structure as a 
focus+context visualisation. It provides detailed and 
contextual views of a selected sub-tree” (Freitas, 
Cava, Winckler, & Palanque, 2003, p. 1093). 

Criterion: The expressed characteristics of an
interactive system. A criterion must be assessable
(i.e., it can be given a value), and it denies or
supports options.

Design Rationale: “A design rationale is an
explanation of how and why an artefact, or some 
portion of it, is designed the way it is. A design 
rationale is a description of the reasoning justifying 
the resulting design — how structures achieve func- 
tions, why particular structures are chosen over 
alternatives, what behaviour is expected under what 
operating conditions. In short, a design rationale 
explains the ‘why’ of a design by describing what the 
artefact is, what it is supposed to do, and how it got 
to be designed that way” (Gruber & Russell, 1990, 
p. 3). 

DREAM (Design Rationale Environment for
Argumentation and Modelling): DREAM is a
tool dedicated to design-rationale capture by
way of an extended QOC notation.

Factor: The expressed requirement of the cus- 
tomer. 



QOC: QOC is a semiformal notation dedicated 
to design rationale. Problems are spelled out in terms 
of questions, options to solve the questions, and 
criteria that valuate each option. See MacLean et al. 
(1991). 

RAD: A software-development process that allows
usable systems to be built in as little as 90 to 120
days, often with some compromises.

Task Model: “Task models describe how ac- 
tivities can be performed to reach the users’ goals 
when interacting with the application considered” 
(Paterno, 2001, p. 359).

INTRODUCTION

The original idea of a portable computer has been 
credited to Alan Kay of the Xerox Palo Alto Re- 
search Center, who suggested this idea in the 1970s 
(Kay, 1972a, 1972b; Kay & Goldberg, 1977). He 
envisioned a notebook-sized, portable computer 
named the Dynabook that could be used for all of the 
user’s information needs and used wireless network 
capabilities for connectivity. 

BACKGROUND 

The first actual portable laptop computers appeared 
in 1979 (e.g., the Grid Compass Computer designed 
in 1979 by William Moggridge for Grid Systems 
Corporation [Stanford University, 2003]). The Grid 
Compass was one-fifth the weight of any model 
equivalent in performance and was used by NASA 
on the space shuttle program in the early 1980s. 
Portable computers continued to develop from the
1980s onwards, and most weighed about 5 kg without
any peripherals.

In 1984, Apple Computer introduced its Apple IIc
model, a true notebook-sized computer weighing
about 5 kg without a monitor (Snell, 2004). The Apple
IIc had an optional LCD panel monitor that made it
genuinely portable and was, therefore, highly
successful.

In 1986, IBM introduced its IBM Convertible PC
with 256KB of memory, which was also a commercial
success (Cringely, 1998). Many consider this the
first true laptop (mainly due to its clamshell design),
and it was soon copied by other manufacturers such
as Toshiba, which was also successful with IBM
laptop clones (Abetti, 1997). These devices retained
the A4-size footprint and full QWERTY keyboards
and weighed between 3 and 4 kg. Following these
innovations, Tablet PCs with a flat A4 footprint and
a pen-based interface began to emerge in the 1990s.

Several devices in the 1970s explored the tablet
format, but in 1989, the Grid Systems GRiDPad was
released, which was the world’s first IBM
PC-compatible Tablet PC that featured handwriting
recognition as well as a pen-based point-and-select
system. In 1992, Microsoft released Microsoft
Windows for Pen Computing, which had an
Application Programming Interface (API) that
developers could use to create pen-enabled
applications. Focusing specifically on devices that
use the pen as the primary input device, this interface
has been most successfully adopted in the new breed
of small, highly portable personal digital assistants.

In 1984, David Potter and his partners at PSION
launched the PSION Organiser, which retailed for just
under £100 (Troni & Lowber, 2001). It was a
battery-powered, 14 cm x 9 cm block-shaped unit
with an alphabetic keyboard and small LCD screen,
with 2KB of RAM, 4KB of applications in ROM, and
a free 8KB data card (which had to be reformatted
using ultraviolet light for reuse). Compared to the
much larger notebook computers of the time, it was
a revolutionary device, but because of its more
limited screen size and memory, it fulfilled a
different niche in the market and began to be used for
personal information management and stock
inventory purposes (with a plug-in barcode reader).

In the late 1980s and 1990s, PSION continued to
develop commercially successful small computing
devices incorporating a larger LCD screen and a
new fully multi-tasking graphical user interface (even
before Microsoft had Windows up and running).
These small devices were truly handheld. The
dimensions of the PSION 3c (launched in 1991) were
165 mm x 85 mm x 22 mm, with a 480 x 160-pixel LCD
screen; the device weighed less than 400 g. A small
keyboard and innovative touch pad provided control
of the cursor, and graphical icons could be selected
to start applications/functions and select items from
menus. The small keyboard proved difficult to use,
however, and the following 5c model in 1997 used an
innovative foldout miniature QWERTY keyboard.
These genuinely handheld devices with their
interface innovations and ability to synchronize data
with a host personal computer made the PSION
models particularly successful and firmly established
the personal digital assistant (PDA) as a portable
computing tool for professionals.

ALTERNATIVE INTERFACES AND 
THE INTEGRATION OF MULTIMEDIA 

The limitations of keyboard-based data entry for
handheld devices had been recognized, and following
PSION’s lead, Apple Computer introduced the
Newton MessagePad in 1993. This device was the
first to incorporate a touch-sensitive screen with a
pen-based graphical interface and
handwriting-recognition software. Although
moderately successful, the device’s handwriting
recognition proved slow and unreliable, and in 1998,
Apple discontinued its PDA development
(Linzmayer, 1999). However, the PDA market was
now becoming firmly based upon devices using
pen-based handwriting recognition for text entry, and
in mid-2001, PSION, with dwindling sales and
difficulties with business partnerships, ceased
trading. US Robotics launched the Palm Pilot in
1996, using its simple Graffiti handwriting
recognition system, and Compaq released the
iPAQ in 1997, incorporating the new Microsoft
Windows CE/Pocket PC operating system with the
first PDA color screen (Wallich, 2002).

Microsoft’s relatively late entry into this market
reflected the considerable research and development
it undertook in developing a user-friendly Pocket
PC handwriting recognition interface. This remains
a highly competitive field, and in November 2002,
PalmSource (the new company owning the Palm
Operating System) replaced the Graffiti system with
Computer Intelligence Corporation’s JOT as the
standard and only handwriting software on all new
Palm Powered devices. Computer Intelligence
Corporation (CIC) was founded in conjunction with
Stanford Research Institute, based on research
conducted by SRI on proprietary pattern recognition
technologies (CIC, 1999). The original Graffiti
system relied on the user learning a series of special
characters, which, though simple, was irksome to
many users. The CIC JOT and Microsoft Pocket PC
systems have been developed to avoid the use of
special symbols or characters and to allow the user
to input more naturally by using standard upper and
lower case printed letters. Both systems also
recognize most of the original Palm Graffiti-based
special characters.

The arrival of the Short Messaging Service (SMS),
otherwise known as text messaging, for cellular
phones in the late 1990s led several PDA
manufacturers to adopt an alternative Thumb Board
interface for their PDAs. SMS allows an individual to
send short text and numeric messages (up to 160
characters) to and from digital cell phones and public
SMS messaging gateways on the Internet. With the
widespread adoption of SMS by the younger
generation, thumb-based text entry (using only one
thumb to input data on cell phone keypads) became
popular (Karuturi, 2003). Abbreviations such as
“C U L8er” for “see you later” developed, along with
emoticons or smileys that reduce the terseness of
the medium and give shorthand emotional indicators.
The rapid commercial success of this input interface
inspired the implementation of Thumb Board
keyboards on some PDAs (e.g., the Palm Treo 600)
for text input. Clip-on Thumb Board input accessories
also have been developed for a range of PDAs.
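
The 160-character limit mentioned above can be illustrated with a short sketch that breaks a longer text into message-sized parts. The function name `split_sms` is hypothetical, and the sketch is a simplification: it only counts characters and breaks at spaces, whereas real handsets also reserve part of each segment for a concatenation header and depend on the character encoding in use.

```python
import textwrap


def split_sms(text, limit=160):
    """Split text into parts of at most `limit` characters, breaking at spaces.

    Simplified illustration only; not how a real SMS stack segments messages.
    """
    return textwrap.wrap(text, width=limit)


short_parts = split_sms("C U L8er :-)")
long_parts = split_sms(" ".join(["hello"] * 50))  # 299 characters in total
```

Here `short_parts` holds a single message, while the 299-character text in `long_parts` is split into two parts, each within the limit.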

Current developments in PDA-based interfaces 
are exploring the use of multimedia, voice recogni- 
tion, and wireless connectivity. The expansion of 
memory capabilities and processor speeds for PDAs 
has enabled audio recording, digital music storage/ 
playback, and now digital image and video record- 
ing/playback to be integrated into these devices. 
This and the integration of wireless network and 
cellular phone technologies have expanded their 
utility considerably. 

Audio input has become very attractive to the
mobile computer user because it can be used when
the user’s hands and eyes are occupied. Also, as
speech does not require a display, it can be used in
conditions of low screen visibility, and it may
consume less power than text-based input in the
PDA. The latest PDA interface innovations include
voice command and dictation recognition (voice to
text), voice dialing, image-based dialing (for cell
phone use, where the






The Development of the Personal Digital Assistant (PDA) Interface 



user states a name or selects an image to initiate a 
call), audio memo recording, and multimedia messag- 
ing (MMS). Several devices (e.g., the new Carrier 
Technologies i-mate) also incorporate a digital cam- 
era. 

Wireless connectivity has brought Internet access
to these devices, enabling users to access e-mail,
text/graphical messaging services (SMS and MMS),
and the Web remotely (Kopp, 1998). These
developments gradually are expanding the PDA’s
functionality into a true multi-purpose tool.

FUTURE TRENDS 

Coding PDA applications to recognize handwriting 
and speech and to incorporate multimedia requires 
additional code beyond traditionally coded interfaces. 
PDA application design and development need to 
support this functionality for the future. 

One of the key limitations of PDA interfaces
remains the output display screen size and resolution.
This arguably remains a barrier to their uptake as the
definitive mobile computing device. As input
technologies improve and as voice and handwriting
recognition come of age, the display capabilities of
these devices will need to be addressed before their
full potential can be realized. The display size and
resolution already are being pushed to the limits by
the latest PDA applications such as global positioning
system (GPS) integration with moving map software
(deHerra, 2003; Louderback, 2004).

Data and device security are key areas for highly
portable networked PDAs, and the first viruses for
PDAs have started to emerge (BitDefender, 2004).
As multimedia interfaces develop, the specific
security issues that they entail (e.g., individual voice
recognition, prevention of data corruption of new file
formats) also will need to be addressed.

CONCLUSION 

Since the early models, manufacturers have
continued to introduce smaller and improved portable
computers, culminating in the latest generation of
powerful handheld PDAs offering fast (400 MHz and
faster) processors with considerable memory (64MB
of ROM and 1GB of RAM or more). This area of
technological development remains highly
competitive, and by necessity, the user interface for
these devices has developed to fulfill the portable
design brief, including the use of pen- and
voice-based data input, collapsible LCD displays,
wireless network connectivity, and now cell phone
integration. Modern PDAs are much more
sophisticated, lightweight devices and are arguably
much closer to Kay’s original vision of mobile
computing than the current laptop or tablet
computers, and they possibly have the potential to
replace these formats with future interface
developments. Indeed, if the interface issues are
addressed successfully, then it is probable that
these devices will outsell PCs in the future and
become the major computing platform for personal
use.

KEY TERMS

Audio Memo: A recorded audio message of 
speech. Speech is digitally recorded via a built-in or 
attached microphone and stored as a digital audio file 
on the storage media of the PDA. 



Emoticon: Text (ASCII) characters used to indicate
an emotional state in electronic correspondence.
Emoticons, or smileys as they are also called,
represent emotional shorthand. For example, :-)
represents a smile or happiness.

Laptop: A portable personal computer small 
enough to use on your lap with a QWERTY key- 
board and display screen. It usually has an A4-sized 
footprint in a clamshell configuration and may incor- 
porate a variety of peripheral devices (e.g., trackball, 
CD-ROM, wireless network card, etc.). 

Media Player: A device or software application 
designed to play a variety of digital communications 
media such as compressed audio files (e.g., MPEG 
MP3 files), digital video files, and other digital media 
formats. 

Multimedia: Communications media that com- 
bine multiple formats such as text, graphics, sound, 
and video (e.g., a video incorporating sound and 
subtitles or with text attached that is concurrently 
displayed). 

Multimedia Messaging Service (MMS): A
cellular phone service allowing the transmission of
multiple media in a single message. As such, it can
be seen as an evolution of SMS, with MMS
supporting the transmission of text, pictures, audio,
and video.

Palmtop: A portable personal computer that can
be operated comfortably while held in one hand. 
These devices usually support a QWERTY key- 
board for data input with a small display screen in an 
A5-sized footprint. 

PDA: Personal Digital Assistant. A small 
handheld computing device with data input and 
display facilities and a range of software applica- 
tions. Small keyboards and pen-based input systems 
are commonly used for user input. 

Pen Computing: Computing that uses an electronic
pen (or stylus) rather than a keyboard for data
input. Pen-based computers often support
handwriting or voice recognition so that users can
write on the screen or vocalize commands/dictate
instead of typing with a keyboard. Many pen
computers are handheld devices. It also is known as
pen-based computing.

Personal Information Manager (PIM): A
software application (e.g., Microsoft Outlook) that
provides multiple ways to log and organize personal
and business information such as contacts, events,
tasks, appointments, and notes on a digital device.

Smartphone: A term used for the combination 
of a mobile cellular telephone and PDA in one small 
portable device. These devices usually use a small 
thumb keyboard or an electronic pen (or stylus) and 
a touch-sensitive screen for data input. 

SMS: Short Message Service. A text message
service that enables users to send short messages
(up to 160 characters) to other users and has the
ability to send a message to multiple recipients.
Using it is known as texting. It is a popular service
among young people. There were 400 billion SMS
messages sent worldwide in 2002 (GSM World, 2002).



Synchronization: The harmonization of data on 
two (or more) digital devices so that both (all) 
contain the same data. Data commonly are synchro- 
nized on the basis of the date they were last altered, 
with synchronization software facilitating the pro- 
cess and preventing duplication or loss of data. 
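
Synchronization on the basis of the date of last alteration can be sketched as below. The record layout (a dictionary mapping a key to a `(modified, value)` pair) and the function name `synchronize` are assumptions made for this illustration; real synchronization software also has to handle deletions and conflicting edits, which this sketch ignores.

```python
from datetime import datetime


def synchronize(device_a, device_b):
    """Merge two {key: (modified, value)} stores; the most recent change wins."""
    merged = {}
    for key in device_a.keys() | device_b.keys():
        # Collect the versions of this record that exist on either device.
        candidates = [store[key] for store in (device_a, device_b) if key in store]
        # Keep the version with the latest modification date.
        merged[key] = max(candidates, key=lambda record: record[0])
    return merged


pda = {
    "alice": (datetime(2004, 1, 5), "555-0100"),
    "bob": (datetime(2004, 1, 1), "555-0111"),
}
pc = {
    "alice": (datetime(2004, 1, 2), "555-0199"),
    "carol": (datetime(2004, 1, 3), "555-0122"),
}
merged = synchronize(pda, pc)
```

After the merge, both devices would hold the same three contacts, and the entry for "alice" keeps the number changed most recently on the PDA, so no record is duplicated or lost.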

Tablet PC: A newer format of personal computer.
It provides all the power of a laptop PC but without
a keyboard for text entry. Tablet PCs use pen-based
input and handwriting and voice recognition
technologies as the main forms of data entry, and
they commonly have an A4-size footprint.

Wireless Connectivity: The communication of
digital devices with one another using data
transmission by radio waves. A variety of standards
for wireless data transmission now exist, established
by the Institute of Electrical and Electronics
Engineers (IEEE) and including Wi-Fi (802.11) and
Bluetooth (802.15).

INTRODUCTION

There are various development methodologies that
are used in developing information systems (ISs),
some more conventional than others. On the
conventional side, there are two major approaches to
systems development methodologies that are used to
develop IS applications: the traditional systems
development methodology and the object-oriented
(OO) development approach. The proponents of HCI
and interaction design propose life cycle models with
a stronger user focus than that employed in the
conventional approaches. Before looking at these
approaches, the researcher needs to consider how
the various methodologies are to be compared and
assessed. There are always inherent problems in
comparing various development methodologies
(The Object Agency, 1993).

It is, in many instances, difficult to repeat the
results of a methodology comparison with any
accuracy. Since few (if any) of the comparisons cite
page references indicating where a particular
methodology comparison item (e.g., a term, concept,
or example) can be found in the methodology under
review, it is difficult, if not impossible, to verify the
accuracy of these methodology comparisons. The
researchers did not compare the methodologies
step-by-step, but rather in terms of whether and when
they address the human element. Researchers have
to acknowledge that methodologies are always in a
state of flux: in theory, one thing happens, and in
practice the methodologies are modified to suit
individual business needs.

BACKGROUND 

Development Methodologies 

This section gives an overview of the three primary 
groups of development methodologies and the major 
phases/processes involved. The aim of all these meth- 
odologies is to design effective and efficient ISs. But 
how effective are they when the wider environment 
is considered? A more contemporary approach is that 
the information system is open to the world and all 
stakeholders can interact with it (see Figure 1). 



Figure 1. Contemporary approach to business

Traditional Systems Development
Approaches

Under the traditional development approaches, there
are various methodologies. All of these approaches
have the following phases in common:

• Planning (why build the system?): Identifying
business value, analysing feasibility, developing a
work plan, staffing the project, and controlling and
directing the project

• Analysis (who, what, when, where will the
system be?): Analysis, information gathering,
process modelling, and data modelling

• Design (how will the system work?): Physical
design, architecture design, interface design,
database and file design, and program design

• Implementation (system delivery): Construction
and installation of the system

We will look at the Dennis and Wixom (2000)
approach.

OO Methodologies

Although diverse in approach, most object-oriented 
development methodologies follow a defined system 
development life cycle, and the various phases are 
intrinsically equivalent for all the approaches, typically
proceeding as follows (Schach, 2002): requirements
phase; OO analysis phase (determining what
the product is to do) and extracting the objects; OO
(detailed) design phase; OO programming phase
(implementing in an appropriate OO programming
language); integration phase; maintenance phase; and
finally retirement. OO stages are not really very
different from the traditional system development 
approaches mentioned previously. 

The OO development approach in general lends
itself to the development of more effective user
interfaces because of the iterative design process,
although this process does not seem to be effectively
managed and guidelines for doing so are often
absent. The authors analyzed three OO methodologies:
the Rumbaugh, Blaha, Premerlani, Eddy, and
Lorensen (1991), Coad and Yourdon (1991), and
IBM (1999) approaches and their relationship to the
aspects illustrated in Figure 1.

HCI-Focused Life Cycle Methodologies 

The HCI proponents aim to focus more on the 
human and end-user aspects. There are four types
of users for most computer systems: naive,
novice, skilled, and expert users. With the
widespread introduction of information and commu- 
nication technology into our everyday lives, most 
computer users today have limited computer experi- 
ence, but are expected to use such systems. 

Usability is a measurable characteristic of a 
product user interface that is present to a greater or 
lesser degree. One broad dimension of usability is
how easy it is for novice and casual users to learn the
user interface (Mayhew, 1999). Another usability
dimension is how easy it is for frequent and proficient
users to use the user interface (efficiency, flexibility,
powerfulness, etc.) after they have mastered the
initial learning of the interface (Mayhew, 1999).

Williges, Williges, and Elkerton (1987) have pro- 
duced an alternative model of systems development 
to rectify the problems in the traditional software 
development models. In their model, interface de- 
sign drives the whole process. Preece, Rogers, and 
Sharp (2002) suggest a simple life cycle model, 
called the Interaction Design Model, consisting of 
identifying needs/establishing requirements; evalu- 
ating; building an interactive version; and 
(re)designing. Other life cycle models that focus on 
HCI aspects include the Star Model of Hartson and 
Hix (1989), the Usability Engineering Life Cycle of 
Mayhew (1999), the Organizational Requirements Definition
for Information Technology (ORDIT) method,
Effective Technical and Human Implementation of
Computer-based Systems (ETHICS), visual
prototyping, and Hackos and Redish’s model (1998).
These methods also introduce various strategies for 
the development of effective user interfaces. 

ASSESSING THE METHODOLOGIES 

One of the problems with the traditional model for 
software development and the OO approaches is
that they do not, in general, clearly identify a role for 
HCI in systems development. User interface con- 
cerns are “mixed in” with wider development activi- 
ties. This may result in one of two problems: either 
HCI is ignored, or it is relegated to the later stages 
of design as an afterthought. In either case, the 
consequences can be disastrous. If HCI is ignored, 
then there is a good chance that problems will occur 
in the testing and maintenance stages. If HCI is
relegated to the later stages in the development cycle
then it may prove very expensive to “massage” the
application’s functionality into a form that can be
readily accessed by the end-user and other stakeholders.

Table 1. Methodology matrix

Approach                              | UI Component | Internal Users | Customers    | Suppliers    | IT Department    | Government

The Dennis and Wixom (2000) Approach
  Planning                            | no           | yes            | not involved | not involved | actively part of | not involved
  Analysis                            | no           | yes            | not involved | not involved | actively part of | not involved
  Design                              | yes          | yes            | not involved | not involved | actively part of | not involved
  Implementation                      | yes          | yes            | not involved | not involved | actively part of | not involved

The Rumbaugh et al. (1991) Model
  Analysis phase                      | attempts     | attempts       | not part of  | not part of  | actively part of | not part of
  System design                       | no           | not involved   | not part of  | not part of  | actively part of | not part of
  Object design                       | no           | not involved   | not part of  | not part of  | actively part of | not part of

The Coad and Yourdon (1991) Model
  Analysis                            | no           | attempts       | not part of  | not part of  | actively part of | not part of
  Design                              | yes          | attempts       | not part of  | not part of  | actively part of | not part of

The IBM (1999) Model
  OO design phase                     | yes          | attempts       | not part of  | not part of  | actively part of | not part of
  The design the business model phase | no           | not part of    | not part of  | not part of  | not part of      | not part of

Williges et al. (1987)
  Initial Design                      | yes          | yes            | attempts     | attempts     | actively part of | attempts
  Formative Evaluation                | yes          | yes            | attempts     | attempts     | actively part of | attempts
  Summative Evaluation                | yes          | yes            | attempts     | attempts     | actively part of | attempts

Hackos and Redish (1998) Approach
  Systems development                 | no           | attempts       | attempts     | attempts     | actively part of | attempts
  Interface design                    | yes          | yes            | attempts     | attempts     | actively part of | attempts
  Design and implementation           | yes          | yes            | attempts     | no           | actively part of | no
  Testing phase                       | yes          | yes            | attempts     | no           | actively part of | no

If we examine the methodology matrix (Table 1) 
in more detail in respect of all the components 
reflected in Figure 1, we find the following with 
regard to the structured development and OO development
methodologies:



a. In the Dennis and Wixom (2000) approach, 
interface design is only considered in the later 
stages of development. The components of 
Figure 1 only partially map onto this approach, 
with no reference to the customers, suppliers, 
the IT department specifically, or the govern- 
mental issues. 

b. In the Rumbaugh et al. (1991) approach, there
is no special consideration given to the design
of the user interface or any of the other
components reflected in Figure 1.

c. In the Coad and Yourdon (1991) model, the
human interaction component includes the actual
displays and inputs needed for effective
human-computer interaction. This model is a
partial fit onto Figure 1. While the internal users
of the system are well catered for, the other 
stakeholders are not actively involved in the 
process at all. 

d. The IBM (1999) model considers the users in 
the development of the system, but Figure 1 is 
still only a partial fit onto this model. The 
internal users are considered in the develop- 
ment of the system, but the external users and 
other stakeholders are sidelined. 
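
The reading of Table 1 given in items (a) to (d) can also be checked mechanically. The following is a minimal Python sketch, assuming the cell values are transcribed from Table 1 as shown; the names MATRIX, COLUMNS, EXTERNAL, phases_with_ui, and approaches_attempting_external are illustrative only and do not belong to any of the methodologies discussed.

```python
# Cell values used in Table 1 (abbreviated here for compactness).
NI, NP, AP, AT = "not involved", "not part of", "actively part of", "attempts"

# Column order of Table 1; the key names are assumptions made for this sketch.
COLUMNS = ("ui", "internal_users", "customers", "suppliers", "it_department", "government")
EXTERNAL = ("customers", "suppliers", "government")

# (approach, phase) -> cells in COLUMNS order, transcribed from Table 1.
MATRIX = {
    ("Dennis and Wixom (2000)", "Planning"): ("no", "yes", NI, NI, AP, NI),
    ("Dennis and Wixom (2000)", "Analysis"): ("no", "yes", NI, NI, AP, NI),
    ("Dennis and Wixom (2000)", "Design"): ("yes", "yes", NI, NI, AP, NI),
    ("Dennis and Wixom (2000)", "Implementation"): ("yes", "yes", NI, NI, AP, NI),
    ("Rumbaugh et al. (1991)", "Analysis phase"): (AT, AT, NP, NP, AP, NP),
    ("Rumbaugh et al. (1991)", "System design"): ("no", NI, NP, NP, AP, NP),
    ("Rumbaugh et al. (1991)", "Object design"): ("no", NI, NP, NP, AP, NP),
    ("Coad and Yourdon (1991)", "Analysis"): ("no", AT, NP, NP, AP, NP),
    ("Coad and Yourdon (1991)", "Design"): ("yes", AT, NP, NP, AP, NP),
    ("IBM (1999)", "OO design phase"): ("yes", AT, NP, NP, AP, NP),
    ("IBM (1999)", "The design the business model phase"): ("no", NP, NP, NP, NP, NP),
    ("Williges et al. (1987)", "Initial Design"): ("yes", "yes", AT, AT, AP, AT),
    ("Williges et al. (1987)", "Formative Evaluation"): ("yes", "yes", AT, AT, AP, AT),
    ("Williges et al. (1987)", "Summative Evaluation"): ("yes", "yes", AT, AT, AP, AT),
    ("Hackos and Redish (1998)", "Systems development"): ("no", AT, AT, AT, AP, AT),
    ("Hackos and Redish (1998)", "Interface design"): ("yes", "yes", AT, AT, AP, AT),
    ("Hackos and Redish (1998)", "Design and implementation"): ("yes", "yes", AT, "no", AP, "no"),
    ("Hackos and Redish (1998)", "Testing phase"): ("yes", "yes", AT, "no", AP, "no"),
}


def phases_with_ui(approach):
    """Phases of one approach whose UI Component cell is 'yes', in table order."""
    return [phase for (name, phase), cells in MATRIX.items()
            if name == approach and cells[COLUMNS.index("ui")] == "yes"]


def approaches_attempting_external():
    """Approaches with at least one 'attempts' cell for an external stakeholder."""
    found = set()
    for (name, _phase), cells in MATRIX.items():
        row = dict(zip(COLUMNS, cells))
        if any(row[col] == AT for col in EXTERNAL):
            found.add(name)
    return sorted(found)
```

Under this transcription, phases_with_ui("Dennis and Wixom (2000)") returns only the Design and Implementation phases, and approaches_attempting_external() returns only the two HCI-focused models, which agrees with the discussion of the structured and OO approaches.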

It is clear from this that there are several missing 
components in all these software development life
cycles (SDLCs). The Coad and Yourdon (1991)
approach explicitly takes into account the HCI aspect
and tends to ignore the other aspects of Figure 1.
The same applies to the IBM (1999) approach, but
the process is much shorter. The Rumbaugh et al. 
(1991) approach is very detailed but still ignores the 
issue of direct mapping to the final user interface 
application. Although in this approach, use case 
scenarios are actively employed and users get in- 
volved in systems design, it does not map directly 
onto the system’s user interface design. 

The root cause of this poor communication is that 
all the conventional development methodologies (in- 
cluding the traditional development approach and 
the OO approach) do not devote adequate attention
to the human aspect of systems development. Many 
researchers have proposed ways of improving the 
systems’ interface, but most of this has not been 
integrated into the development techniques. The 
researchers’ findings are a confirmation of the work 
of Monarchi and Puhr (1992). 

When we consider Table 1 with regard to the 
HCI-focused development models, we find that: (a) 
Williges et al. (1987) try to introduce the usability 
issues at a much earlier stage of the development 
process, but this model is not widely used; (b) The 
Hackos and Redish (1998) model seems to be the 
most comprehensive one we assessed. The short- 
coming of this model is, however, that it still ignores 
the outside stakeholders, unless the corporate objec- 
tives phase states categorically that the organization 
should give special consideration to the external
users, such as customers, suppliers, and government.
Hackos and Redish (1998) are silent on this
issue, however, and do not elaborate on what they 
mean by “corporate objectives.” If the corporate 
objectives do include the outside stakeholders, this 
is the only model that we investigated that does this. 
In fact, if this is the case, the model maps onto 
Figure 1. The usability engineering is done in parallel
with the systems development, and integrated
throughout. 

FUTURE TRENDS 

There are major gaps of communication between 
the HCI and SE fields: the methods and vocabulary 
being used by each community are often foreign to 
the other community. As a result, product quality is 
not as high as it could be, and (avoidable) re-work is 
often necessary (Vanderdonckt & Harning, 2003). 
The development of mutually accepted standards 
and frameworks could narrow the communication 
gap between the HCI and SE fields. 

CONCLUSION 

The shortcomings of all the methodologies are therefore
related to the complexity of the wider environment
introduced by the issues highlighted in Figure 1,
and how these aspects should inform the system's
development process. None of the development 
methodologies addressed the human component or 
the issue of other stakeholders sufficiently. Both the 
traditional SDLC and OO approaches fall short on
the issue of human aspects and stakeholder involvement.
Although we expected the OO approaches to
fare better on these issues, the results given earlier 
clearly illustrate that these methodologies still have 
a long way to go in fully integrating environmental 
issues. Which one fared the best? Although the 
Williges et al. (1987) and Hackos and Redish (1998) 
approaches focusing on the user go a long way 
towards achieving this, several shortcomings can 
still be identified. There has to be a balanced ap- 
proach to systems development and HCI develop- 
ment components in the overall systems develop- 
ment process. 
KEY TERMS

Effective Technical & Human Implementation
of Computer-Based Systems (ETHICS): A
problem-solving methodology that has been developed
to assist the introduction of organizational
systems incorporating new technology. It has as its 
principal objective the successful integration of com- 
pany objectives with the needs of employees and 
customers. 

Object-Oriented Design (OOD): A design 
method in which a system is modelled as a collection 
of cooperating objects and individual objects are 
treated as instances of a class within a class hierar- 
chy. Four stages can be discerned: identify the 
classes and objects, identify their semantics, identify 
their relationships, and specify class and object 
interfaces and implementation. Object-oriented de- 
sign is one of the stages of object-oriented program- 
ming. 

Ordit: Based on the notions of role, responsibil- 
ity, and conversations, making it possible to specify, 
analyze, and validate organizational and information 
systems supporting organizational change. The Ordit 
architecture can be used to express, explore, and 
reason about both the problem and the solution 
aspects in both the social and technical domains. 
From the simple building blocks and modelling lan- 
guage, a set of more complex and structured models 
and prefabrications can be constructed and rea- 
soned about. Alternative models are constructed, 
allowing the exploration of possible futures. 

System Development Life Cycle (SDLC):
The process of understanding how an information
system can support business needs, designing the 
system, building it, and delivering it to users. 

Usability: ISO 9241-11 standard definition for 
usability identifies three different aspects: (1) a 
specified set of users, (2) specified goals (tasks) 
which have to be measurable in terms of effective- 
ness, efficiency, and satisfaction, and (3) the context 
in which the activity is carried out. 

Usability Engineering: Provides structured 
methods for optimizing user interface design during 
product development. 
INTRODUCTION

With the ubiquitous availability of the Internet,
creating a centralized repository of an
individual’s knowledge has become possible. Although,
at present, there are many efforts to develop
collaborative systems such as wikis (Leuf & 
Cunningham, 2002), Web logs or blogs (Winer, 
2002) and sharable content management systems 
(Wikipedia, 2004), an area that is overlooked is the 
development of a system that would manage per- 
sonal knowledge and information. For example, in an 
educational setting, it has been found that most 
lecturers customize content to suit their particular 
delivery styles. This article outlines a framework
that uses Web technologies to allow the storage and
management of personal information, the sharing of
the content with other personal systems, and the
capture of annotations within context from
people who visit the personal knowledge portfolio.

BACKGROUND 

Continuing with the case of a lecturer, a vast amount 
of knowledge will be accumulated. This needs to be 
organised in a way so that it can be delivered in a 
variety of contexts. For example, a piece of knowl- 
edge about image resizing could be useful in the 
following domains: Web page design, databases, 
multimedia, and digital photography. But, this knowl- 
edge is continually changing as printed media are 
read or other people contribute with their comments 
or observations. Also, knowledge does not exist in
one format; for example, an image can be used to
illustrate a concept, a video can be used to show
directions, and so forth. 

With the ability to manage a wide variety of 
digital formats, Web technologies have become an 
obvious way to organise an individual’s knowledge.
In the early 1990s, the Web was primarily made up 
of many static Web pages, where content and layout 
were hard coded into the actual page, so managing 
the ever-changing aspect of content was a time 
consuming task. In the late 1990s, database technologies
and scripting languages such as ASP (Active
Server Pages) and PHP (a recursive acronym
for PHP: Hypertext Preprocessor, originally Personal Home Page)
emerged, and with these came opportunities to develop
new ways to capture, manage, and display individual 
and shared knowledge. 

But, what is knowledge? Generally, it is attached 
to an individual, and can be loosely defined as “what 
we know” or “what you have between the ears” 
(Goppold, 2003). Experience shows that an
individual’s content is personalized; as mentioned
earlier, lecturers tend to customize content to suit
their particular delivery style. So combining the
notions that knowledge is something dynamic and is 
attached to an individual, that it may be enhanced 
and modified by other individuals, and that Web 
technologies can assist in its management, the “Vir- 
tual Me framework” has been developed. 

THE Virtual Me FRAMEWORK 

Figure 1 illustrates the concept of the Virtual Me framework.
The framework essentially is made up of three
parts: the sniplet model, which includes the multimedia
object model, and an annotation capability.

Figure 1. Rich picture of the Virtual Me framework




Sniplet Model 

In order to manage knowledge, the smallest useful 
and usable piece of content needs to be defined. 
Several prototypes indicated that a workable size for
the content is one that can be represented
by a single overhead projection. In order to refer to
this, the term sniplet was coined (Verhaart, 2002).
A sniplet needs to maintain context of the content
with respect to everything else in the environment.
Hence, sniplets are initially classified in a backbone
taxonomy (Guarino & Welty, 2002). The proposed 
Virtual Me framework allows for alternative tax- 
onomies to be created where the content can be used 
in other domains. 
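
As an illustration of a sniplet that keeps one backbone classification while being reused in other domains, the following is a minimal Python sketch. It is not the framework's actual data model: the Sniplet class, its field names, and the second-level category names are assumptions made for this example, while the four domains come from the image resizing example given earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Sniplet:
    """One overhead-sized piece of content (hypothetical structure)."""
    title: str
    content: str
    backbone: tuple                                    # path in the backbone taxonomy
    alternatives: list = field(default_factory=list)   # paths in alternative taxonomies

    def domains(self):
        """Top-level domain of every taxonomy path the sniplet is filed under."""
        return [self.backbone[0]] + [path[0] for path in self.alternatives]


resize = Sniplet(
    title="Image resizing",
    content="Resizing changes the pixel dimensions of an image.",
    backbone=("Multimedia", "Images"),                 # sub-level names are illustrative
    alternatives=[
        ("Web page design", "Graphics"),
        ("Databases", "Stored images"),
        ("Digital photography", "Editing"),
    ],
)
```

Here the content is stored once, and each alternative taxonomy only adds another path to it, so a change to the sniplet is seen in every domain that uses it.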

In an electronic sense, an overhead can consist 
of many media elements, or digital assets, such as 
images, sounds, animations, and videos. But a digital 
asset also has other issues that need to be consid- 
ered. For example, an image displayed on a com- 
puter screen (at 75 dpi) produces poor quality results 
when produced in hardcopy (600 dpi and above). If 
accessibility issues are included, then the ability to 
represent a digital asset in multiple forms is required. 
For example, an image needs to be described in text, 
or alternatively in a sound file, to assist screen 
readers or for those visitors who have sight impair- 
ments. Finally, if the digital asset is to maintain its 
original context and ownership, some meta-data 
needs to be attached to it. There are many meta-data
standards available; for example, Dublin Core (DCMI,
2004) describes the object, and vCard (1996) describes
the creator. Extensible Markup Language
(XML) is a portable way for data to be described on 
the Internet, by providing a structure where data is 
easily categorized. For example, an XML file could 
contain <author>Verhaart</author>. vCard is commonly
distributed using its own format, but a paper
for the World Wide Web Consortium (W3C) (Iannella,
2001) described vCard in the Web formats XML and
the Resource Description Framework (RDF) (Miller,
Swick, & Brickley, 2004). Another important feature
is that the digital asset has some permanency,
that is, in the future it can be located. On the Internet,
the Uniform Resource Identifier (of which a Uniform
Resource Locator, or URL, is a subset) is one
way to give a resource an address. RDF
takes this a stage
further and also structures the resource using
Extensible Markup Language (XML) (W3C, 2004b).
This is one of the cornerstones of the semantic Web 
where objects of the Web maintain some meaning. 
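
To make the idea of XML meta-data concrete, the following is a minimal Python sketch that attaches three Dublin Core elements to a digital asset using the standard library. The wrapper element name asset, the function describe_asset, and the example.org address are assumptions made for illustration; only the Dublin Core namespace and the element names title, creator, and identifier come from the standard itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)


def describe_asset(title, creator, identifier):
    """Return an XML string describing one digital asset (hypothetical layout)."""
    root = ET.Element("asset")  # wrapper element name is an assumption
    ET.SubElement(root, f"{{{DC}}}title").text = title
    ET.SubElement(root, f"{{{DC}}}creator").text = creator
    # A URI gives the asset an address by which it can be located later.
    ET.SubElement(root, f"{{{DC}}}identifier").text = identifier
    return ET.tostring(root, encoding="unicode")


xml_text = describe_asset(
    "Image resizing diagram",
    "Verhaart",
    "http://example.org/assets/image-resizing",  # placeholder address
)
```

Because the elements are namespaced, a program that understands Dublin Core can find the creator of the asset without knowing anything else about the surrounding document.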

To cope with the available standards, a digital 
asset is described in the Virtual Me framework 
using a multimedia object (MMO) (Verhaart, 
Jamieson, & Kinshuk, 2004). An MMO essentially is 
a manifest of files that address the issues described 
previously. It is made up of the actual files that form 
the digital asset (multiple images, maybe a sound 
file) plus a special file that manages the meta-data 
for the digital asset. In order to describe the meta-data
and associated files, a description language
(Media Vocabulary Markup Language - MVML) 
has been developed (Verhaart & Kinshuk, 2004), and 
is based on the standards mentioned previously: 
XML, RDF, Dublin Core and vCard. 
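
The manifest idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the classes MediaFile and MultimediaObject, their fields, and the file names are hypothetical, and the sketch does not reproduce the MVML format or the MMO naming standard.

```python
from dataclasses import dataclass, field


@dataclass
class MediaFile:
    filename: str
    form: str        # "image", "text", or "sound"
    dpi: int = 0     # image resolution; 0 when not applicable


@dataclass
class MultimediaObject:
    """A manifest: the files of one digital asset plus its meta-data file."""
    name: str
    metadata_file: str
    files: list = field(default_factory=list)

    def best_for_print(self, minimum_dpi=600):
        """Highest-resolution image suitable for hardcopy, or None."""
        images = [f for f in self.files if f.form == "image" and f.dpi >= minimum_dpi]
        return max(images, key=lambda f: f.dpi) if images else None

    def accessible_forms(self):
        """Text or sound renditions, e.g. for screen readers."""
        return [f for f in self.files if f.form in ("text", "sound")]


mmo = MultimediaObject(
    name="resize",
    metadata_file="resize.mvml",                 # hypothetical file name
    files=[
        MediaFile("resize_screen.png", "image", dpi=75),
        MediaFile("resize_print.png", "image", dpi=600),
        MediaFile("resize.txt", "text"),
    ],
)
```

Keeping several renditions behind one name lets the same asset be shown on screen, printed, or described in text, which are the three concerns raised in the discussion of digital assets.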

Annotations 

The ability to capture the many types of knowledge 
possessed by an individual is an integral part of 
creating a Virtual Me. Explicit knowledge is the
simplest, since it can be easily written down, while
tacit knowledge (Polanyi, 1958) is often impossible
to communicate in words and symbols (Davidson
& Voss, 2002) and is much more difficult to capture.
Tacit knowledge is often recalled when a context is 
presented, and in the physical world can be preceded 
with “Oh.. I remember...”. What about missing 
knowledge, that is, the knowledge “we don’t know 
we don’t know”? This is where visitors can assist. In 
the real world, this is made up of research or personal 
contacts filling in the blanks. The Virtual Me needs 
the ability for visitors to add their own annotations at 
various levels. Annotations can be classified at three 
levels: creator only (intended to operate like post-it 
type notes, although they should be viewable by site 
owner), creator and site owner (a direct communica- 
tion between them), and public (viewable to any 
visitor). 
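
The three annotation levels can be expressed as a small visibility rule. The following Python sketch is one possible reading of the description above, not the framework's implementation; the names Level, Annotation, and can_view are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    CREATOR_ONLY = 1        # post-it style note
    CREATOR_AND_OWNER = 2   # direct communication with the site owner
    PUBLIC = 3              # viewable to any visitor


@dataclass
class Annotation:
    creator: str
    text: str
    level: Level


def can_view(annotation, viewer, site_owner):
    """Decide whether a viewer may see an annotation."""
    if annotation.level is Level.PUBLIC:
        return True
    # Both non-public levels remain visible to the creator and to the site
    # owner; they differ in intent (private note versus message), not visibility.
    return viewer in (annotation.creator, site_owner)


note = Annotation("visitor1", "Check the 600 dpi figure.", Level.CREATOR_ONLY)
```

With this rule, a creator-only note behaves like a private post-it for the visitor while still being visible to the site owner, as the text requires.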

Another technology gaining popularity to capture 
missing knowledge is the wiki. Cortese (2003) indi- 
cated that a wiki is Web collaboration software used 
by informal online groups, and is taking hold in the 
business realm. The term was coined in 1995 by Ward
Cunningham; wiki means quick in Hawaiian
(Jupitermedia Corporation, 2004). It is proposed that
in the Virtual Me, a wiki type field is available to 
users for each sniplet (Figure 1). Here, a visitor 
would make a copy of the sniplet and be able to edit 
it. The site owner could then update the actual sniplet 
from the wiki to allow inclusion of missing knowl- 
edge. 
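
The proposed copy, edit, and update cycle could look like the following minimal Python sketch. The classes SnipletText and WikiField and their methods are hypothetical names chosen for this example; the source describes the workflow but not an implementation.

```python
import copy
from dataclasses import dataclass


@dataclass
class SnipletText:
    title: str
    content: str


class WikiField:
    """Holds a visitor-editable copy of a sniplet next to the original."""

    def __init__(self, sniplet):
        self.original = sniplet
        self.draft = copy.copy(sniplet)   # visitors edit the copy, never the original

    def edit(self, new_content):
        """Called on behalf of a visitor."""
        self.draft.content = new_content

    def accept(self):
        """Called by the site owner to update the actual sniplet from the wiki copy."""
        self.original.content = self.draft.content


sniplet = SnipletText("Image resizing", "Resizing changes pixel dimensions.")
wiki = WikiField(sniplet)
wiki.edit("Resizing changes pixel dimensions and may reduce quality.")
content_before_accept = sniplet.content
wiki.accept()
```

The design choice here is that the owner stays in control: a visitor's edit has no effect on the published sniplet until accept() is called.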

FUTURE TRENDS 

Development of learning resources is a costly and 
time consuming process. In order to facilitate the
sharing and management of content, there is considerable
research in the construction and standardization
of learning object repositories (McGreal
& Roberts, 2001). IEEE Learning Technology Stan- 
dards Committee (IEEE LTSC) (1999) and Shar- 
able Content Object Reference Model (SCORM) 
(Advanced Distributed Learning, 2003) compliance
is considered to be the current standard, although
many countries are developing standards based on 
these, such as United Kingdom Learning Object 
Metadata (UK LOM) Core (Cetis, 2004) and the 
Canadian equivalent, Cancore (Friesen, Fisher, & 
Roberts, 2004). Content packagers such as RE- 
LOAD (2004) are evolving to allow manifests of 
files to be easily bundled together with learning 
object meta-data. A survey by Verhaart (2004) 
indicated that problems such as ownership have 
caused learning object repositories to become meta- 
data sites linked to personal content. This is a 
persuasive argument for the Virtual Me frame- 
work. 

The second major trend is that of the semantic 
Web. This is the vision of the World Wide Web’s
creator, Tim Berners-Lee, and the W3C (2004) is
investing a significant part of its resources in making
this vision a reality. The semantic Web is
fundamentally about adding context and meaning to 
the Web content. Use of the structures suggested in 
this article will enable media and multimedia objects 
to retain and maintain their original context even 
when used in many different contexts. 

CONCLUSION 

This article presents a discussion where the management
of information is returned to the
individual, even though there is a prevailing trend to
centralize information. Observations in an educational
setting indicate that many content deliverers
(tutors and lecturers) prefer to customize material to
reflect their personal styles. Further, personal ownership
is a powerful motivator in the management
and evolution of information. 

The Virtual Me framework attempts to address 
this issue, allowing for the creation of a personal 
learning portfolio where visitors can contribute to 
the building of the content repository. 
KEY TERMS

Digital Asset: An electronic media element, 
that may be unstructured such as an image, audio, or 
video, or structured such as a document or presen- 
tation, usually with associated meta-data. 

Dublin Core: A set of 15 meta-data fields such 
as title and author, commonly used by library sys- 
tems to manage digital assets. (All fields are op- 
tional.) 

Learning Object: An artifact or group of arti- 
facts with learning objectives that can be used to 
increase our knowledge. 

Media Vocabulary Markup Language 
(MVML): An XML-based language that describes a
media element. 



Meta-Data: Commonly data about data. For 
example, a digital asset has meta-data which would 
include the derived data (size, width) and annotated 
data (creator, description, context). 

Multimedia Object (MMO): A self-describ- 
ing manifest of files used to encapsulate an elec- 
tronic media element. Consists of media files con- 
forming to a defined naming standard and an asso- 
ciated MVML file. 

Resource Description Framework (RDF): Part
of the Semantic Web; a way to uniquely
identify a resource whether electronic or not.

Sniplet: A piece of knowledge or information 
that could be represented by one overhead transpar- 
ency. 

vCard: A meta-data format that enables a
person to be described. This is used extensively in 
commercial e-mail systems and can be thought of as 
an electronic business card. 

Virtual Me: A framework that uses Internet 
technologies to structure a personal portfolio and 
allows external users to add annotations. A sniplet is 
its basic unit, and digital assets are structured as 
multimedia objects (MMOs). 

Web Log (Blog): An online diary, typically 
authored by an individual, where unstructured com- 
ments are made and annotations can be attached. 

Wiki: A publicly modifiable bulletin board, where 
anyone can change the content. Some provide fea- 
tures so that changes can be un-done. From “wiki” 
meaning “quick” in Hawaiian, and coined by Ward 
Cunningham in 1995. 
INTRODUCTION

What is the Internet?

The development of the Internet has a relatively 
brief and well-documented history (Cerf, 2001; 
Griffiths, 2001; Leiner et al., 2000; Tyson, 2002). 
The initial concept was first mooted in the early 
1960s. American computer specialists visualized the 
creation of a globally interconnected set of comput- 
ers through which everyone quickly could access 
data and programs from any node, or place, in the 
world. In the early 1970s, a research project initiated 
by the United States Department of Defense inves- 
tigated techniques and technologies to interlink packet 
networks of various kinds. This was called the 
Internetting project, and the system of connected 
networks that emerged from the project was known 
as the Internet. The initial networks created were 
purpose-built (i.e., they were intended for and largely 
restricted to closed specialist communities of re- 
search scholars). However, other scholars, other 
government departments, and the commercial sec- 
tor realized the system of protocols developed during 
this research (Transmission Control Protocol [TCP] 
and Internet Protocol [IP], collectively known as the 
TCP/IP Protocol Suite) had the potential to revolu- 
tionize data and program sharing in all parts of the 
community. A flurry of activity, beginning with the 
National Science Foundation (NSF) network 
NSFNET in 1986, over the last two decades of the 
20th century created the Internet as we know it
today. In essence, the Internet is a collection of 
computers joined together with cables and connec- 
tors following standard communication protocols. 

What is the World Wide Web? 

For many involved in education, there appears to be 
an interchangeability of the terms Internet and 
World Wide Web (WWW). For example, teachers
often will instruct students to “surf the Web,” to use
the “dub, dub, dub,” or alternatively, to find information
“on the net” with the assumption that there is
little, if any, difference among them. However, there 
are significant differences. As mentioned in the 
previous section, the Internet is a collection of 
computers networked together using cables, con- 
nectors, and protocols. The connection established 
could be regarded as physical. Without prior knowl- 
edge or detailed instructions, the operators of the 
connected computers are unaware of the value, 
nature, or appropriateness of the material stored at 
the node with which they have connected. The 
concepts underlying the WWW can be seen to 
address this problem. As with the Internet, the 
WWW has a brief but well-documented history 
(Boutell, 2002; Cailliau, 1995; Griffiths, 2001). Tim
Berners-Lee is recognized as the driving force
behind the development of the protocols, simplifying
the process of locating the addresses of networked
computers and retrieving specific documents for
viewing. It is best to imagine the WWW as a virtual
space of electronic information storage. Information
contained within the network of sites making up the
Internet can be searched for and retrieved by a
special protocol known as the Hypertext Transfer
Protocol (HTTP). While the WWW has no single,
recognizable, central, or physical location, the spe- 
cific information requested could be located and 
displayed on users’ connected devices quickly by 
using HTTP. The development and refinement of 
HTTP were followed by the design of a system 
allowing the links (the HTTP code) to be hidden 
behind plain text, activated by a click with the mouse, 
and thus, we have the creation and use of Hypertext 
Markup Language (HTML). In short, HTTP and 
HTML made the Internet useful to people who were 
interested solely in the information and data con- 
tained on the nodes of the network and were uninter- 
ested in computers, connectors, and cables. 
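
The way HTML hides an address behind plain text can be shown with a short Python sketch that reads a fragment of HTML and recovers each link's visible text together with its target. The class name LinkCollector and the example.org address are assumptions made for illustration; the parser used is the one in the Python standard library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (visible text, target address) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None     # address of the anchor currently being read
        self._text = []       # pieces of visible text inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# The reader sees only the words; the address stays in the markup.
page = '<p>Read the <a href="http://example.org/history.html">history of the Internet</a>.</p>'
collector = LinkCollector()
collector.feed(page)
```

After feed() runs, collector.links holds the single pair of visible text and address, which is the information a browser uses when the reader clicks the words.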
BACKGROUND

Educational Involvement 

The use and development of the Internet in the 1970s 
was almost entirely science-led and restricted to a 
small number of United States government depart- 
ments and research institutions accessing online 
documentation. The broader academic community 
was not introduced to the communicative power of 
networking until the start of the 1980s with the 
creation of BITNET (Because It’s Time Network)
and EARN (European Academic and Research
Network) (Griffiths, 2001). BITNET and EARN
were electronic communication networks among
higher education institutes and were based on the
power of electronic mail (e-mail). The development
of these early networks was boosted by policy 
decisions of national governments; for example, the 
British JANET (Joint Academic Network) and the 
United States NSFNET (National Science Founda- 
tion Network) programs that explicitly encouraged 
the use of the Internet throughout the higher educational
system, regardless of discipline (Leiner et al.,
2000). By 1987, the number of computer hosts
connected to networks had climbed to 28,000, and by
1990, 300,000 computers were attached (Griffiths,
2001). However, the development of the World
Wide Web and Hypertext Markup Language, com- 
bined with parallel development of browser soft- 
ware applications such as Netscape and Internet 
Explorer, led to the eventual decline of these e-mail- 
based communication networks (CREN, 2002). Educa- 
tional institutions at all levels joined the knowledge age. 

FUTURE TRENDS 

The advances in and decreasing costs of computer 
software and hardware in the 1980s resulted in 
increased use of and confidence in computer tech- 
nologies by teachers and learners. By the mid- 
1990s, a number of educational institutions were 
fully exploiting the power of the Internet and the 
World Wide Web. Search engines to locate and 
retrieve information had been developed, and a mini- 
publication boom of Web sites occurred (Griffiths, 
2001). In the early stages, educational institutions
established simple Web sites providing potential students
with information on staff roles and responsibilities;
physical resources and layout of the institution;
past, present, and upcoming events; and a range
of policy documents. As confidence grew, institutions
began to use a range of Web-based applications
such as e-mail, file storage, and exams to make
available separate course units or entire programs
to a global market (Bonk et al., 1999). Currently,
educational institutions from elementary levels
to universities are using the WWW and the
Internet to supplement classroom instruction, to give 
learners the ability to connect to information
(instructional and other resources), and to deliver learning
experiences (Clayton, 2002; Haynes, 2002; Rata
Skudder et al., 2003). In short, the Internet and the
WWW altered some approaches to education and 
changed the way some teachers communicated with 
students (McGovern & Norton, 2001; Newhouse, 
2001). Over the past decades there has been, and
continues to be, an explosion
of instructional ideas, resources, and courses on the
WWW, as well as new
funding opportunities for creating courses with
WWW components (Bonk, 2001; Bonk et al., 1999;
van der Veen et al., 2000). While some educators
regard online education with suspicion and are criti- 
cal that online learning is based on imitating what 
happens in the classroom (Bork, 2001), advocates of 
online, Web-assisted, or Internet learning would 
argue that combining face-to-face teaching with 
online resources and communication provides a richer 
learning context and enables differences in learning 
styles and preferences to be better accommodated 
(Aldred & Reid, 2003; Bates, 2000; Dalziel, 2003; 
Mann, 2000). In the not-too-distant future, the use of 
compact, handheld, Internet-connected computers 
will launch the fourth wave of the evolution of 
educational use of the Internet and the WWW 
(Savill-Smith & Kent, 2003). It is envisaged that 
young people with literacy and numeracy problems 
will be motivated to use the compact power of these 
evolving technologies in learning (Mitchell & Doherty, 
2003). These students will be truly mobile, choosing 
when, how, and what they will learn. 

CONCLUSION 

The initial computer-programming-led concept of 
the Internet first mooted in the early 1960s has 
expanded to influence all aspects of modern society.
The development of the Hypertext Transfer Protocol 
to identify specific locations and the subsequent 
development of Hypertext Markup Language to dis- 
play content have enabled meaningful connections to 
be made from all corners of the globe. As procedures 
and protocols were established, search facilities were 
developed to speed up the discovery of resources. At 
this stage, educationalists and educational institutions 
began to use the power of the Internet to enhance 
educational activities. Although, in essence, all we
are doing is tapping into a bank of computers
that act as storage devices, the potential for
transformation of educational activity is limitless. 
Increasingly, students will independently search for 
resources and seek external expert advice, and stu- 
dent-centered learning will have arrived. 
KEY TERMS

HTML: Hypertext Markup Language (HTML) 
was originally developed for the use of plain text to 
hide HTTP links. 



HTTP: Hypertext Transfer Protocol (HTTP) is 
a protocol allowing the searching and retrieval of 
information from the Internet. 

Internet: An internet (note the small i) is any set 
of networks interconnected with routers forwarding 
data. The Internet (with a capital I) is the largest 
internet in the world. 

Intranet: A computer network that provides 
services within an organization. 

Node: A point where a device (a computer,
server, or other digital device) is connected
to the Internet; more often called a host.

Protocol: A set of formal rules defining how to 
transmit data. 

TCP/IP Protocol Suite: The system of proto- 
cols developed to network computers and to share 
information. There are two protocols: the Transmis- 
sion Control Protocol (TCP) and the Internet Proto- 
col (IP). 

World Wide Web: A virtual space of electronic 
information and data storage. 
INTRODUCTION

Two decades ago, the U.S. Air Force asked human 
factors experts to compile a set of guidelines for 
command and control software because of software 
usability problems. Many other government agen- 
cies and businesses followed. Now hundreds of 
guidelines exist. Despite all the guidelines, however, 
most Web sites still do not use them. One of the 
biggest resulting usability problems is that users 
cannot find the information they need. In 2001, 
Sanjay Koyani and James Mathews (2001), re- 
searchers for medical Web information, found, “Re- 
cent statistics show that over 60% of Web users 
can’t find the information they’re looking for, even 
though they’re viewing a site where the information 
exists”. In 2003, Jakob Nielsen (2003), an interna- 
tionally known usability expert, reported, “On aver- 
age across many test tasks, users fail 35% of the 
time when using Web sites.” Now in 2005, Muneo 
Kitajima, senior researcher with the National Insti- 
tute of Advanced Industrial Science and Technol- 
ogy, speaks of the difficulties still present in locating 
desired information, necessitating tremendous 
amounts of time attempting to access data (Kitajima, 
Kariya, Takagi, & Zhang, to appear). 

This comes at great costs to academia, govern- 
ment, and business, due to erroneous data, lost sales, 
and decreased credibility of the site in the opinion of 
users. Since emotions play a great role in lost sales 
and lost credibility, the goal of this study was to 
explore the question, “Does the use of usability 
guidelines affect Web site user emotions?” The 
experimenter tasked participants to find information 
on one of two sites. The information existed on both 
sites; however, one site scored low on usability, and 
one scored high. After finding nine pieces of information,
participants reported their frequency of excitement,
satisfaction, fatigue, boredom, confusion,
disorientation, anxiety, and frustration. Results favored
the site scoring high on usability.

BACKGROUND 

In 2003, Sanjay Koyani, Robert W. Bailey, and 
Janice R. Nall (2003) conducted a large analysis of 
the research behind all available usability guidelines. 
They identified research to validate existing guide- 
lines, identify new guidelines, test the guidelines, and 
review literature supporting and refuting the guide- 
lines. They chose reviewers representing a variety 
of fields including cognitive psychology, computer 
science, documentation, usability, and user experi- 
ence. “The reviewers were all published research- 
ers with doctoral degrees, experienced peer review- 
ers, and knowledgeable of experimental design” 
(Koyani et al., 2003, p. xxi). They determined the 
strength of evidence for each guideline, based on the 
amount of evidence, type of evidence, quality of 
evidence, amount of conflicting evidence, and amount 
of expert opinion agreement with the research 
(Koyani et al., 2003, pp. xxi-xxii). They then scored 
each guideline with points for evidence as follows: 5 
= strong research support, 4 = moderate research 
support, 3 = weak research support, 2 = strong 
expert opinion, and 1 = weak expert opinion. 

Throughout this article, the author organizes
usability topics into the following groups:

• Visibility of Location: Pearrow (2000, p. 167)
states, “Users want to know where they are in 
a Web site, especially when the main site 
contains many microsites.” One way to help 
users know their location is to provide a site 
map (Danielson, 2002). A site map is an outline 
of all information on a site. Koyani et al. (2003,
p. 62) found moderate research support that
site maps enhance use if topics reflect the
user’s conceptual structure. Other aids such as 
headers and navigation paths may also be use- 
ful. 

• Consistency: WordNet, maintained by the 
Cognitive Science Laboratory of Princeton Uni- 
versity (2005), defines consistency as “a har- 
monious uniformity or agreement among things 
or parts”. The purpose of consistency is “to 
allow users to predict system actions based on 
previous experience with other system ac- 
tions” (NASA/Goddard Space Flight Center, 
1996, p. 1). Ways to make a Web site consis- 
tent include placing navigation elements in the 
same location (Koyani et al., 2003, p. 59), and 
placing labels, text, and pictures in the same 
location (Koyani et al., 2003, p. 97). Using the 
same or similar colors, fonts, and backgrounds 
for similar information will also provide consis- 
tency, as will following business, industry, and 
government standards. 

• Error Prevention and Error Recovery:
The least costly way to prevent errors is to
provide a well-designed Web site at the outset. 
Even though users tend to accommodate for 
inconsistencies, Koyani et al. (2003, p. 97) 
found strong research support showing a rela- 
tionship between decreased errors and visual 
consistency. In addition, Asim Ant Ozok and 
Gavriel Salvendy (2003) found well-written 
sites decrease comprehension errors. Addi- 
tional ways to prevent errors may be to provide 
Undo and Redo commands as well as Back and 
Forward commands, provide a Frequently- 
Asked Questions section, and provide help 
menus and search menus. However, even the 
best designed site will not prevent all errors. 
When errors do occur, sites need to provide 
users with ways to recover from them. Ben 
Shneiderman (1998, p. 76) advises, “Let the 
user see the source of error and give specific 
positive instructions to correct the error.” 

• Inverted Pyramid Style: For the purposes of 
this paper, the inverted pyramid style refers to 
putting the most important information at the 
top of the page. Koyani et al. (2003, p. 47) 
found moderate research support for the usefulness
of putting the most important information
on the top of the page and the least used
information at the bottom of the page. 

• Speaking the User’s Language: Speaking 
the user’s language refers to speaking the 
language of the intended audience. Koyani et 
al. (2003, p. 145) found a strong expert opinion 
to support avoiding acronyms. It follows that if 
site owners must use acronyms, they should 
provide an acronym finder and/or glossary. 
Other ways to speak the user’s language are to 
avoid jargon and to provide a search engine that 
recognizes naturalistic language, or language 
consistent with real-world conventions. 

• Easy Scanning: Koyani et al. (2003, p. 157) 
found with moderate evidence that “80% of 
users scan any new page and only 16% read 
word-by-word.” Therefore, it may be useful to 
make information easy to scan, especially when 
presenting large amounts of information. Ways 
to accomplish this may be to use a bold or 
italicized font when users need to understand 
differences in text content. Avoid underlining,
however, as users may confuse an underlined
word or phrase with a link. In addition, highlighting
may make information easy to visually
scan. However, Koyani et al. (2003, p. 77) 
advise, “Use highlighting sparingly... (use it 
for). . .just a few items on a page that is other- 
wise relatively uniform in appearance.” 

• Proper Printing: For this study, providing for 
“proper printing” means providing for the 
printed page to look the same as the presented 
page on the computer screen. For example, a 
printed copy of a Web page should not show 
the right side of the page trimmed off, if it does 
not appear as such to the user. Not all users 
know how to query the Help function to assist 
them with this problem. Although Koyani et al. 
(2003, p. 21) did not find support for providing 
proper printing in experimental research, they 
did find a strong expert opinion supporting 
proper printing. 

• Short Download Time: Download time re- 
fers to the time “to copy data (usually an entire 
file) from a main source to a peripheral device” 
(Webopedia, 2005). Koyani et al. (2003, p. 16) 
found moderate support for the usefulness of 
minimizing Web page download time. The best 
way to decrease download time is to limit the
number of bytes per page (Koyani et al., 2003, p. 17).
Stakeholders should determine what download 
time is desired for their Web site, and designers 
should aim for that time or shorter. 

• Providing Assistance: Providing assistance 
refers to helping users use a Web site and may 
include: providing shortcuts for frequent users, 
legends describing icon buttons, a presentation 
that is flexible to user needs, a search engine 
with advanced search capabilities, providing for 
recognition rather than recall, and phone num- 
bers and addresses for technical assistance as 
well as regular contact. 
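The relationship between page size and download time described under Short Download Time can be sketched with simple arithmetic. The following Python sketch assumes an idealized model in which download time is the number of bits divided by the link speed; it ignores latency, protocol overhead, and server delay. The 56 kbit/s link and the function names are hypothetical, and the seven-second target is taken from the checklist in Appendix A.

```python
def download_seconds(page_bytes, link_bits_per_second):
    """Idealized transfer time: payload bits divided by link speed.

    Assumption: latency, protocol overhead, and server delay are ignored.
    """
    return page_bytes * 8 / link_bits_per_second


def byte_budget(target_seconds, link_bits_per_second):
    """Largest page size in bytes that meets the target under the same model."""
    return int(target_seconds * link_bits_per_second / 8)


MODEM_BPS = 56_000  # hypothetical dial-up link of 56 kbit/s

if __name__ == "__main__":
    # Time for a hypothetical 60,000-byte page on the assumed link.
    print(round(download_seconds(60_000, MODEM_BPS), 1))
    # Page-size budget for a seven-second target on the same link.
    print(byte_budget(7, MODEM_BPS))
```

Under this model, stakeholders can turn a desired download time into a byte budget per page, which designers can then check during each redesign iteration.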

FOCUS 

The redesign team consisted of a graphic designer, a 
systems administrator, a usability expert, three hu- 
man factor visualization experts, a computer scien- 
tist, a security officer, and a human factors adminis- 
trative assistant. They met an average of once every 
three weeks for 12 months. Before and during the 
redesign process, the team determined and discussed 
their desired audience and the goals for their site. The 
team redesigned the site using an iterative process of 
making changes, then reviewing those changes with 
the team and potential users, and repeating this 
process until establishing a consensus. 

Evaluation of the Original Web Site 

Before redesign, the team asked employees who 
were representative of the Web site’s desired audi- 
ence what they liked and did not like in sites similar 
to the division site. To conduct a formative, or initial,
evaluation, the usability expert devised a “Usability
Evaluation Checklist” (Appendix A) based on em- 
ployee comments, input from the human factors 
representatives on the redesign team, and common 
usability guidelines from usability literature. Usability 
literature included the NASA/Goddard User Inter- 
face Guidelines (NASA/Goddard Space Flight Cen- 
ter, 1996), Designing the User Interface 
(Shneiderman, 1998), Usability Inspection Methods 
(Nielsen & Mack, 2004), the Web site Usability 
Handbook (Pearrow, 2000), current journals, and 
usability sites such as Jakob Nielsen’s Alertbox. 



Redesign 

The team redesigned the original site to correct the
problems found in the formative evaluation. They
made the following changes.

• Location Status: When using the original 
Web site, some employees reported they felt 
“lost” and said they wanted to know where 
they were on the site. To address this, the 
team added tabs describing the major contents 
of the site, as well as a site map. Some users 
also stated they wanted to skip back and forth 
to topic areas without having to repeatedly key 
the Back button. To address this, the team 
provided a frame that appeared at the top of all 
pages. This frame included tabs for the seven 
main areas of the site, including a tab for the 
site map. 

• Consistency: The original Web site did pro- 
vide consistent terminology and color. Head- 
ers, however, were inconsistent in appear- 
ance and placement. Therefore, the team pro- 
vided similar headers for all pages. The team 
also provided consistent font type, font size, 
background color, and layout for all pages. 

• Error Prevention and Error Recovery: 
The team added a Help feature and a Fre- 
quently-Asked Questions section for user sup- 
port. The team changed some background and 
foreground colors to provide adequate con- 
trast for trouble-free reading. In addition, they 
replaced cluttered interfaces with more 
“minimalist” interfaces. 

• Inverted Pyramid Style: The original site 
scored excellent in this category; therefore, 
the team made no changes. 

• Speaking the User’s Language: The team 
identified all acronyms, spelled them out on 
their initial appearance, and provided an acro- 
nym finder. In addition, they added a search 
engine with naturalistic language capability. 

• Easy Scanning: The team eliminated the 
right sidebar to eliminate horizontal scrolling 
and increase scanning. They also used bolding 
and italicizing to aid scanning, but used it 
sparingly to preserve attention. 

• Proper Printing: Originally, the team eliminated
the right sidebar to provide easy scanning
capabilities. However, this change had a
dual effect because it also eliminated printing 
problems with trimming off the right sides of 
pages. 

• Short Download Times: The team found 
download time was under three seconds. They 
avoided elaborate features in the redesign, and 
again checked download time on interim and 
summative evaluations. On completion, down- 
load times were still less than three seconds on 
employment computers, and less than seven 
seconds on a sample of home computers. 

• Assistance: The team provided drop-down
menus and a search engine with advanced
search capability.

Method 

The experimenter randomly assigned fifteen partici- 
pants from the research laboratory’s subject pool to 
either the original Web site or the redesigned site. 
The experimenter tasked participants to find nine 
pieces of information on their assigned site. The 
experimenter gave all participants the same tasks. 
All participants completed their tasks separately 
from other participants. Each group’s site was the 
same, except for the redesign changes described 
earlier. Following the tasks, participants completed a
questionnaire regarding the frequency of the following
emotions during their task assignment: fatigue, boredom,
confusion, disorientation, anxiety, frustration,
satisfaction, and excitement.



Results 

An independent statistician ran t-tests on the data. 
Participants using the redesigned site reported
significantly higher frequencies of satisfaction
and excitement than those using the original site. In addition,
they reported significantly
lower frequencies of frustration, fatigue, boredom,
and confusion (see Figure 1). The t-tests showed no
significant differences in disorientation or anxiety. 
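As an illustration of the kind of comparison a t-test makes between two independent groups, the following Python sketch computes a t statistic for two small samples. The article does not state which variant of the t-test the statistician used; Welch's unequal-variance form is assumed here. The ratings are invented for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances
    se_sq = var_a / n_a + var_b / n_b                      # squared standard error
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_sq)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical frustration-frequency ratings on a 1-5 scale (not study data).
ORIGINAL_SITE = [4, 3, 5, 4, 4, 3, 5, 4]
REDESIGNED_SITE = [2, 1, 2, 3, 2, 1, 2]

if __name__ == "__main__":
    t, df = welch_t(ORIGINAL_SITE, REDESIGNED_SITE)
    print(f"t = {t:.2f}, df = {df:.1f}")
```

The resulting statistic would then be compared against the critical value of the t distribution for the computed degrees of freedom to decide whether the difference between the groups is significant.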

FUTURE TRENDS 

There is a debate over the number of participants 
needed to find usability problems. Jakob Nielsen 
(1993) and Robert A. Virzi (1992) proposed five 
participants, while Laura Faulkner (2003) proposed
20. The author recommends future studies include 
larger groups of 20 to 30 to increase statistical 
power. 
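The argument for five participants is usually traced to a cumulative problem-discovery model published by Nielsen and Landauer, in which the expected share of problems found by n participants is 1 - (1 - p)^n. The Python sketch below evaluates that formula. Neither the formula nor the value p = 0.31 (the average detection rate Nielsen reports) comes from this article; both are included only as illustrative assumptions.

```python
def share_of_problems_found(participants, p_single=0.31):
    """Expected share of usability problems found by a number of participants.

    Uses 1 - (1 - p)^n, where p_single is the assumed probability that one
    participant reveals a given problem.
    """
    if participants < 0:
        raise ValueError("participants must be non-negative")
    return 1 - (1 - p_single) ** participants


if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, round(share_of_problems_found(n), 3))
```

This model concerns how many problems are found, not the statistical power needed to compare two sites, which is the reason the author gives for recommending larger groups.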

In this study, it is possible that the positive results 
occurred because of all of the changes. However, it 
is also possible that the results occurred because of 
only some of the changes or only one of the 
changes. It is also possible that a variable
other than usability was the cause. Candidates
include variables in the design process, such as the
consultation of numerous users (cognitive task analy- 
ses) during the redesign process. The author recom- 
mends further research to determine which variables, 
or combination of variables, produce the best results. 



Figure 1. Comparison of original site to revised site 
The author also recommends future research to
address individual personality differences and situ- 
ational differences. Examples of questions include: 
“Do certain guidelines produce better results for 
situational differences such as chaotic or stressful 
situations, than they do for calm situations?” In 
addition, “Do certain guidelines produce better re- 
sults for different personality types?” For example, 
some users may enjoy the challenge of finding 
“hard-to-find” information, while others may be 
annoyed or frustrated by the same situation. 

The author also cautions that self-reported emo- 
tions may not be reflective of true emotions. Some 
participants may have reasons to present them- 
selves as being positive or at ease with a site when 
they are not. For example, many users feel that ease 
of use is a sign of intelligence so they may want to 
appear at ease. They may also want to appear 
positive or happy to please the tester. One emerging 
trend in software research that may address prob- 
lems with self-reporting is biometric research. Bio- 
metric data includes heart activity, sweat gland 
activity, blood pressure, and brain activity. Although 
these data are more objective than self-reported 
emotion, users may have different emotions than 
they normally do simply because of the apparatus. 
Therefore, the author recommends both subjective 
and objective studies to capitalize on the benefits of 
both types of software metrics. 

CONCLUSION 

Coverage of the topic of usability is about 20 years 
old. Large-scale studies of the complex relationships
between emotions and usability, however, are only 
recently emerging. This study attempts to under- 
stand these relationships. In answer to the question, 
“Does the Use of Usability Guidelines Affect Web 
Site User Emotions?” this study answers a tentative 
“Yes.” However, caveats do apply. First, this was a 
small study with only seven to eight participants in 
each group. Second, when experimenters measure 
emotions by self-report, the reports may not be 
accurate. Third, situational and individual differ- 
ences need further research in order to generalize to 
additional types of situations and different personal- 
ity types. Studying user emotions when users are 
trying to find information is challenging. However,
our users deserve a fulfilling experience. The benefits
of helping all Web users find the information
they need and want should be well worth the chal- 
lenge in increased effectiveness, efficiency, and 
satisfaction. 



DISCLAIMER 

The views expressed in this article are those of the 
author and do not reflect the official policy or 
position of the United States Air Force, Department 
of Defense, or the U.S. Government. 
KEY TERMS

Consistency: Consistency in Web sites refers 
to keeping similar Web pages similar in their look and 
feel. Examples of ways to achieve consistency 
include using the same or similar colors, font, and 
layout throughout the site. 

Easy Scanning: Sanjay Koyani, Robert W. 
Bailey, and Janice R. Nall (2003, p. 157) found with 
moderate evidence that “80% of users scan any new 
page and only 16% read word-by-word.” Therefore, 
it may be useful to make information easy to scan 
when presenting large amounts of information. Ways 
to accomplish this may be to use a bold or italicized 
font so users can quickly pick up differences in 
content. Avoid underlining, however, as the user may
confuse an underlined word or phrase with a link. In 
addition, highlighting may make information easy to 
visually scan. However, Koyani et al. (2003, p. 77) 
advise, “Use highlighting sparingly (using) just a few 
items on a page that is otherwise relatively uniform 
in appearance.” 



Minimalist Design: Refers to providing simple 
and easy to read screen designs. When Web designs 
are not minimalist, they may cause cognitive over- 
load, or the presence of too much information for 
users to process. Keeping pages uncluttered and 
chunking information into categories are examples 
of ways to provide a minimalist design. 

Providing for Recognition rather than Re- 
call: Recognition is easier than recall, as evidenced 
in most “multiple-choice” questions compared to 
“fill-in-the-blank” questions. For example, when users
return to a Web site, they may not recall where 
certain information occurred, although they may 
recognize it when they see it. Examples of ways to 
provide for recognition rather than recall include 
providing drop-down menus, providing for
bookmarking, and providing a search engine. When
providing a search engine, most experts recommend
explaining its use as well as providing for advanced 
searching. This accommodates the needs of novice 
users as well as advanced users. 

Short Download Time: Download time refers 
to the time “to copy data (usually an entire file) from 
a main source to a peripheral device” (Webopedia, 
2005). Sanjay Koyani, Robert W. Bailey, and Janice 
R. Nall (2003, p. 16) found moderate support for the 
usefulness of minimizing Web page download time. 
The best way to decrease download time is to limit 
the number of bytes per page (Koyani et al., 2003, p. 17).

Speaking the User’s Language: Refers to 
speaking the language of the intended Web audi- 
ence. It means avoiding jargon, acronyms, or system 
terms that some of the intended audience may not 
understand. If you must use jargon, acronyms or 
system terms, provide a glossary, and/or an acronym 
finder. Another way to speak the user’s language is 
to ensure your search engine recognizes naturalistic 
language. 

Visibility of Location: In the field of Web 
usability, visibility of location refers to letting users 
know where they are in a Web site as well as the 
status of their inputs and navigation. Examples of 
ways to increase visibility of location include provid- 
ing a site map, headers, and navigation paths. 
APPENDIX A



USABILITY EVALUATION CHECKLIST 

Patricia A. Chalmers, Ph.D. 

U.S. Air Force 

Evaluator: Score every item as a 1, 2, 3, 4, or 5 with points as follows: 

1 = Never occurs 

2 = Occurs rarely 

3 = Occurs sometimes 

4 = Occurs much of the time 

5 = Occurs all the time 

Visibility of Location 

The interface: 

enables users to know where they are within the system 
provides a link to the Web site’s “home page” 
provides the user with a shortcut to go back to topics 

Consistency 

The interface uses: 

consistent terminology 

consistent color, for same or similar topic headings 
consistent background for same or similar pages 
consistent layout for pages 
consistent colors for accessed links 
consistent colors for unaccessed links 

Error Prevention and Error Recovery 

The interface provides: 
a Help Menu 

“Frequently-Asked Questions” 

The interface is uncluttered 

The “Inverted Pyramid” Style 

The interface locates: 

the most important information at the top 
the least important information at the bottom 

Speaking the User’s Language 

The interface uses: 

language consistent with real-world conventions 

natural language (rather than jargon, acronyms, or system terms) 

If jargon is used, there is a glossary 

If acronyms are used, the interface has an “acronym finder” 

Easy Scanning 

Key words and phrases are easily visible (for example, by highlighting) 

The interface organizes information into manageable chunks 

The user can see the screen without scrolling horizontally with 640 x 480 resolution on a 14” monitor 
Proper Printing

The user can print selected pages (rather than everything) 

Printed contents are free of right-side “trimmings”

Short Download Times 

The Web site downloads within seven seconds 

Provide Assistance 

The interface provides: 

shortcuts for frequent users 

legends describing icon buttons 

presentation that is flexible to user needs 

a search engine 

advanced search capabilities 

recognition rather than recall 

e-mail addresses for contact 

phone numbers for contact 

postal addresses for traditional mail contact 

e-mail addresses for technical assistance 

phone numbers for technical assistance 
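One way an evaluator might summarize the checklist above is to average the 1-to-5 item scores within each category. The Python sketch below does this. The article does not state how scores were aggregated, so the use of category means is an assumption, and the sample ratings and shortened item labels are hypothetical.

```python
from statistics import mean

VALID_SCORES = {1, 2, 3, 4, 5}  # 1 = never occurs ... 5 = occurs all the time


def category_means(ratings):
    """Average the 1-5 item scores within each checklist category."""
    result = {}
    for category, items in ratings.items():
        scores = list(items.values())
        bad = [s for s in scores if s not in VALID_SCORES]
        if bad:
            raise ValueError(f"{category}: scores must be 1-5, got {bad}")
        result[category] = mean(scores)
    return result


# Hypothetical ratings for two of the checklist categories.
SAMPLE_RATINGS = {
    "Visibility of Location": {
        "users know where they are": 2,
        "link to home page": 5,
        "shortcut back to topics": 2,
    },
    "Short Download Times": {
        "downloads within seven seconds": 5,
    },
}

if __name__ == "__main__":
    for category, score in category_means(SAMPLE_RATINGS).items():
        print(f"{category}: {score}")
```

Low category means would point the redesign team to the areas most in need of change.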
INTRODUCTION

In this article, we discuss the concept of elastic 
interfaces, which was originally introduced by Masui, 
Kashiwagi, and Borden (1995) a decade ago for the 
manipulation of discrete, time-independent data. It
has recently attracted renewed attention through our own work,
in which we adapted and extended it for use
in several other applications, most importantly in
the context of continuous, time-dependent docu- 
ments (Hurst & Gotz, 2004; Hurst, Gotz, & Lauer, 
2004). The basic idea of an elastic interface is 
illustrated in Figure 1. Normally, objects are moved
by dragging them directly to the target position 
(direct positioning). With elastic interfaces, the ob- 
ject follows the cursor or mouse pointer on its way 
to the target position with a speed s that is a function
of the distance d between the cursor and the object. 
They are called elastic because the behavior can be 
explained by the rubber-band metaphor, in which the 
connection between the cursor and the object is seen 
as a rubber band: The more the band is stretched, the 
stronger the force between the object and the cursor 
gets, which makes the object move faster. Once the 
object and cursor come closer to each other, the 
tension on the rubber band decreases, thus slowing
down the object’s movement. 
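The rubber-band behavior can be illustrated with a minimal one-dimensional Python sketch. The text specifies only that the speed s is a function of the distance d; the linear mapping s = gain * d, the gain value, the time step, and all function names below are assumptions made for illustration.

```python
def elastic_speed(distance, gain=0.2):
    """Speed s as a function of the distance d (assumed linear: s = gain * d)."""
    return gain * distance


def step(object_pos, cursor_pos, dt=1.0, gain=0.2):
    """Advance the object one time step toward the cursor."""
    d = cursor_pos - object_pos              # signed distance to the cursor
    move = elastic_speed(abs(d), gain) * dt  # distance covered in this step
    move = min(move, abs(d))                 # never overshoot the cursor
    return object_pos + (move if d >= 0 else -move)


def simulate(object_pos, cursor_pos, steps, dt=1.0, gain=0.2):
    """Return the object's positions over a number of time steps."""
    path = [object_pos]
    for _ in range(steps):
        object_pos = step(object_pos, cursor_pos, dt, gain)
        path.append(object_pos)
    return path


if __name__ == "__main__":
    # The cursor is held 100 units away; the object follows, slowing as it nears.
    for position in simulate(0.0, 100.0, steps=5):
        print(round(position, 2))
```

With the cursor held still, each step is shorter than the one before: the object moves fast while the band is stretched and slows as it approaches the cursor.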



In the next section we describe when and why 
elastic interfaces are commonly used and review 
related approaches. Afterward, we illustrate differ- 
ent scenarios and applications in which elastic inter- 
faces have been used successfully for visual data 
browsing, that is, for skimming and navigating through 
visual data. First, we review the work done by Masui 
(1998) and Masui et al. (1995) in the context of 
discrete, time-independent data. Then we describe 
our own work, which applies the concept of elastic 
interfaces to continuous, time-dependent media 
streams. In addition, we discuss specific aspects 
concerning the integration of such elastic behavior
into common GUIs (graphical user interfaces)
and introduce a new interface design that is especially
useful in the context of multimedia-document
skimming.

BACKGROUND 

Direct positioning is usually the approach of choice 
when an object has to be placed at a specific target 
position. However, elastic interfaces have advan- 
tages in situations in which the main goal is not to 
move the object itself, but in which its movements 
are mapped to the motion of another object.

Figure 1. Illustration of the concept of elastic interfaces

Figure 2. Illustration of the scaling problem of scroll bars and sliders. If a document is very long, not every position within it can be mapped onto the scale of the corresponding slider or scroll bar. As a consequence, parts of the document’s content cannot be accessed directly with the slider, and the resulting jumps during scrolling lead to a jerky visualization that users usually find disturbing and irritating.

The most typical examples for such a case are scroll bar
and slider interfaces, for which the dragging of the 
scroll bar or slider thumb to a target position is 
mapped to the corresponding movements within an 
associated document. One common problem with 
scroll bars and sliders is that an arbitrary document
length has to be matched to their scale, which is
limited by window size and screen resolution (the
scaling problem; compare Figure 2). Hence, if the
document is very long, specific parts of the file are
not accessible directly because being able to access
any arbitrary position in the document would require
movements of the slider thumb on a subpixel level. 
This is impossible with direct manipulation since a 
pixel is the smallest unit to display (and thus to 
manipulate) on the screen. In addition, the move- 
ment of the document’s content during scrolling 
becomes rather jerky, which is usually considered 
irritating and disturbing by users. This is where 
elastic interfaces come into play: Since the scrolling 
speed is indirectly manipulated based on the map- 
ping of the distances between cursor and thumb to a 
corresponding speed, navigation becomes indepen- 
dent of the scroll bar’s or slider’s scale and thus 
independent of the actual length of the document. If 
the function for the distance-to-speed mapping is 
chosen appropriately, subpixel movements of the 
thumb and thus slow scrolling on a finer scale can be 
simulated. 
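
The arithmetic behind the scaling problem can be made concrete with a short sketch. The function names and the numbers (a 90,000-frame video and a 600-pixel slider) are assumptions chosen for illustration only; they do not come from any of the systems discussed here.

```python
def units_per_pixel(document_length: int, slider_pixels: int) -> float:
    """Document units (lines, frames, ...) covered by one pixel of thumb movement."""
    if slider_pixels <= 0:
        raise ValueError("slider_pixels must be positive")
    return document_length / slider_pixels


def reachable_positions(document_length: int, slider_pixels: int) -> list:
    """Document positions that direct dragging can reach, one per thumb pixel."""
    step = units_per_pixel(document_length, slider_pixels)
    positions = {min(document_length - 1, int(p * step)) for p in range(slider_pixels)}
    return sorted(positions)


# Hypothetical example: a 90,000-frame video on a 600-pixel slider.
jump = units_per_pixel(90_000, 600)            # frames skipped per pixel of movement
reachable = reachable_positions(90_000, 600)   # frames that can be selected directly
print(jump, len(reachable))
```

Under these assumed numbers, every pixel of thumb movement skips 150 frames, so only 600 of the 90,000 frames can be reached by direct dragging. This is the jump size that an elastic distance-to-speed mapping is meant to avoid.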

Other solutions to solve the scaling problem have 
been proposed in the past, mainly as extensions or 
replacements of slider interfaces. The basic func- 



tionality of a slider is to select a single value or entry 
by moving the slider thumb along the slider bar, 
which usually represents an interval of values. A 
typical selection task is, for example, the modifica- 
tion of the three values of an RGB color by three 
different sliders, one for each component. If visual 
feedback is given in real-time, sliders can also be 
used for navigation either in a continuous, time- 
dependent media file, such as a video clip, or to 
modify the currently visible part of a static, time- 
independent document whose borders expand be- 
yond the size of its window (similar to the usage of 
a scroll bar). In both cases, again, the user drags the 
thumb along the bar in order to select a single value. 
In the first case, this value is a specific point in time 
(or the corresponding frame of the video), and in the 
second case, it is a specific position in the document 
(and the task is to position the corresponding content 
within the visible area of the screen). 

Most approaches that try to avoid the scaling 
problem have been proposed either for selection 
tasks or for scrolling interfaces that enable naviga- 
tion in static, time-independent data. The most well 
known is probably the Alphaslider introduced by 
Ahlberg and Shneiderman (1994). Here, the thumb 
of a slider or a scroll bar is split into three different 
areas, each of which allows for navigation at a 
different granularity level. Ayatsuka, Rekimoto, and 
Matsuoka (1998) proposed the Popup Vernier in 
which the user is able to switch between different 
scrolling resolutions by using additional buttons or 
keys. Instead of relying on different granularities for
the whole slider scale, the TimeSlider interface introduced
by Koike, Sugiura, and Koseki (1995) works
with a nonlinear resolution of the scale. This way, 
users can navigate through a document at a finer 
level around a particular point of interest. 
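
The idea of a nonlinear scale resolution can be sketched as follows. The cubic mapping, the function names, and the numbers are assumptions made purely for illustration; the source does not specify which function the TimeSlider actually uses.

```python
def nonlinear_offset(pixel_offset: int, half_width_px: int, half_range: float) -> float:
    """Map a signed pixel offset from the focus point to a signed document offset.

    Hypothetical cubic mapping: very fine resolution next to the focus point,
    coarse resolution toward the ends of the scale.
    """
    if half_width_px <= 0:
        raise ValueError("half_width_px must be positive")
    normalized = max(-1.0, min(1.0, pixel_offset / half_width_px))
    return half_range * normalized ** 3


def linear_offset(pixel_offset: int, half_width_px: int, half_range: float) -> float:
    """Conventional linear scale, shown only for comparison."""
    return half_range * pixel_offset / half_width_px


# Hypothetical example: 300 pixels and 45,000 frames on each side of the focus.
near_nonlinear = nonlinear_offset(1, 300, 45_000)   # a fraction of a frame
near_linear = linear_offset(1, 300, 45_000)         # 150 frames
far_nonlinear = nonlinear_offset(300, 300, 45_000)  # still reaches the end of the range
```

With such a mapping, one pixel next to the point of interest corresponds to far less than one frame, while the full range remains reachable at the ends of the scale.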

So far, only a few projects have dealt with the
scaling problem in the context of navigation through 
continuous, time-dependent data such as video files. 
One of the few examples is the Multi-Scale Timeline 
Slider introduced by Richter, Brotherton, Abowd, 
and Truong (1999) for browsing lecture recordings. 
Here, users can interactively add new scales to a 
slider interface and freely modify their resolution in 
order to be able to navigate through the correspond- 
ing document at different granularity levels. Other 
approaches have been proposed in the context of 
video editing (see Casares et al., 2002, for example). 
However, video editing is usually done on a static 
representation of the single frames of a video rather 
than on the continuous signal, which changes over 
time. Thus, approaches that work well for video 
editing cannot necessarily be applied for video brows- 
ing (and vice versa). In Hurst, Gotz, and Jarvers 
(2004), we describe two slider variants, namely the 
ZoomSlider and the NLslider, which allow for inter- 
active video browsing at different granularity levels 
during replay. Both feature some advantages com- 
pared to other approaches, but unfortunately they 
share some of their disadvantages as well. 

Most of the approaches described above rely on 
a modification of the scale of the slider in order to 
enable slower scrolling or feature selection at a finer 
granularity. However, this has one significant disad- 
vantage: Adjusting the slider’s scale to a finer reso- 
lution makes the slider bar expand beyond the bor- 



ders of the corresponding window, thus resulting in 
situations that might be critical to handle when the 
thumb reaches these borders. Approaches such as 
the Alphaslider circumvent this problem by not 
modifying the slider’s scale explicitly, but instead by
doing an internal adaptation in which the move- 
ments of the input devices are mapped differently 
(i.e., with a finer resolution) to the movements of 
the corresponding pointer on the screen. However, 
this only works with a device that controls the 
pointer on the screen remotely, such as a mouse, but 
it can be critical in terms of usability if a device is 
used that directly interacts with the object on the 
screen, such as a pen on a touch screen. In the 
following two sections, we describe how elastic 
interfaces can be used successfully to avoid both of 
the problems just mentioned as well as the scaling 
problem. 






ELASTIC INTERFACES APPLIED TO 
STATIC, TIME-INDEPENDENT DATA 

In their original work about elastic interfaces, Masui 
et al. (1995) introduced the so-called FineSlider to 
solve the scaling problem in relation to the task of 
value selection. The FineSlider is illustrated in Fig- 
ure 3 together with a possible distance-to-speed 
mapping that defines how the distance between the 
current position of the slider’s thumb and the mouse
pointer is mapped to the actual scrolling speed. Its 
feasibility and usefulness were proven in a user 
study in which the test persons had to select a single 
value from a list of alphabetically sorted text entries 
of which only one entry was visible at a time (Masui 



Figure 3. FineSlider interface and corresponding distance-to-speed mapping 
et al.). This task is similar to a random-selection task
in which a single value is selected from a somehow 
sorted list of values. However, the authors describe 
usages of the FineSlider concept in relation to some 
other applications as well, including the integration 
of the elastic scrolling functionality into a regular 
scroll bar in order to solve the scaling problem when 
navigating through static documents such as text or 
images. An actual implementation of this is pre- 
sented in Masui (1998), in which the LensBar inter- 
face for the visual browsing of static data was 
introduced. Among other functionalities, the LensBar 
offers elastic navigation by moving a scroll-bar-like 
thumb and thus enables skimming through a docu- 
ment at a random granularity. In addition to these 
navigation tasks, Masui et al. describe the usage of 
elastic interfaces for the modification of the position 
of single points in a 2-D (two-dimensional) graphics 
editor. In such a situation, direct positioning is nor- 
mally used because the task is not to remotely 
modify an object, but to reposition the object itself 
(compare Figure 1). However, elastic interfaces can
still be useful in such a situation, for example, if 
accurate positioning on the pixel level is hard to do, 
which is sometimes the case with pen-based input 
devices and touch screens. 
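
A distance-to-speed mapping of the kind described above can be sketched in a few lines. The quadratic function, the dead zone, the gain value, and the simulation loop are all assumptions for illustration; they are not the mapping shown in Figure 3 or the FineSlider implementation.

```python
def elastic_speed(distance: float, dead_zone: float = 2.0, gain: float = 0.02) -> float:
    """Map the signed pointer-to-thumb distance (pixels) to a scroll speed
    (document units per tick). Hypothetical quadratic mapping with a dead zone."""
    magnitude = abs(distance)
    if magnitude <= dead_zone:
        return 0.0
    speed = gain * (magnitude - dead_zone) ** 2
    return speed if distance > 0 else -speed


def simulate(pointer_px, thumb_value, doc_length, slider_px, ticks):
    """Pull the thumb toward a resting pointer; return the document value per tick."""
    values = []
    for _ in range(ticks):
        thumb_px = thumb_value / doc_length * slider_px   # thumb position in pixels
        thumb_value += elastic_speed(pointer_px - thumb_px)
        thumb_value = max(0.0, min(doc_length - 1, thumb_value))
        values.append(thumb_value)
    return values


# Hypothetical example: 90,000 frames, 600-pixel slider, pointer 10 px right of the thumb.
steps = simulate(pointer_px=10, thumb_value=0.0, doc_length=90_000, slider_px=600, ticks=5)
print(steps)
```

In this sketch, a pointer held 10 pixels from the thumb advances the document by roughly one frame per tick, far less than the 150 frames that a single pixel of direct dragging would represent in the same setting. This is the simulated subpixel movement the text refers to.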

All these tasks for the selection of values, the 
navigation in documents, or the repositioning of 
objects have been discussed and realized in relation 
to time-independent, static data. However, the con- 
cept of elastic interfaces can also be applied to 
continuous visual data streams, as we describe in the 
next section. 



ELASTIC INTERFACES APPLIED TO 
TIME-DEPENDENT DATA 

If a media player and the data format of the respec- 
tive file support real-time access to any random 
position within the document, a slider can be used to 
visually browse the file’s content in the same way as
a scroll bar is used to navigate and skim discrete 
data, for example, a text file. This technique, which 
is sometimes referred to as random visible scrolling 
(Hurst & Muller, 1999), has proven to be a very 
easy, convenient, and intuitive way for the visual 
browsing of continuous data streams. However, the 
scaling problem, which appears in relation to dis- 



crete documents (compare Figure 2), occurs here as 
well. In fact, it is sometimes considered even more 
critical because the resulting jerky visual feedback 
can be particularly disturbing in the case of a con- 
tinuous medium such as the visual stream of a video. 
In Hurst, Gotz, et al. (2004), we applied the concept 
of elastic interfaces described above to video brows- 
ing. The conceptual transfer of elastic interfaces 
from static data to continuous data streams is straight- 
forward. Instead of selecting single values or posi- 
tions within a static document, a value along a 
timeline is manipulated. In a video, this value relates 
to a single frame, which is displayed instantly as a 
result of any modification of the slider thumb caused 
by the user. However, because of the basic differ- 
ences between these two media types (i.e., discrete 
values or static documents vs. time-dependent, con- 
tinuous data streams), it is not a matter of course that 
the concept works as well for continuous media as 
it does for discrete data in terms of usability. For 
example, when using a scroll bar to navigate through 
some textual information, it is not only a single entity 
such as a line or even just a word that is usually 
visible to the user, but there is also some context, that 
is, all the lines of the text that fit into the actual 
window size. With a continuous visual data stream, 
for example, a video, just one frame and therefore no 
context is shown at a time. While this is also true for 
value selection using a slider, those values are 
usually ordered in some way, making navigation and 
browsing much easier. Ramos and Balakrishnan 
(2003) introduced the PVslider for video browsing, 
which features some sort of elastic behavior as well. 
However, the authors assume a slightly different 
interpretation of the term elastic by measuring the 
distance (i.e., the tension of a virtual rubber band) 
between the mouse pointer and a fixed reference 
point instead of a moving object (the slider thumb), 
which makes their approach more similar to an 
elastic fast-forward and rewind functionality than to 
the idea of an elastic interface as it was introduced 
originally by Masui et al. (1995). 
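
The difference between the two interpretations of the term elastic can be illustrated with a small sketch. The linear gain and both function names are hypothetical; the code models only the distinction drawn above (moving thumb versus fixed reference point) and not the actual PVslider or FineSlider implementations.

```python
def speeds_moving_thumb(pointer: float, thumb: float, ticks: int, gain: float = 0.5) -> list:
    """Distance is measured to the thumb, which moves toward the pointer."""
    speeds = []
    for _ in range(ticks):
        speed = gain * (pointer - thumb)
        thumb += speed            # the thumb catches up, so the distance shrinks
        speeds.append(speed)
    return speeds


def speeds_fixed_reference(pointer: float, reference: float, ticks: int, gain: float = 0.5) -> list:
    """Distance is measured to a fixed reference point, which never moves."""
    return [gain * (pointer - reference) for _ in range(ticks)]


moving = speeds_moving_thumb(100.0, 0.0, 3)     # [50.0, 25.0, 12.5]
fixed = speeds_fixed_reference(100.0, 0.0, 3)   # [50.0, 50.0, 50.0]
```

With a resting pointer, the moving-thumb variant slows down as the thumb approaches the pointer, whereas the fixed-reference variant keeps a constant speed, which is why the latter behaves more like a fast-forward or rewind control.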

In an informal study with the interface presented 
on the left side of Figure 5, we showed the useful- 
ness and feasibility of the original elastic-slider 
approach in relation to video browsing (Hurst, Gotz, 
et al., 2004). With this implementation, users are 
able to visually skim through the video very fast (by 
moving the mouse pointer quickly away from the 
Figure 4. Illustration of the elastic panning approach

(a) The initial clicking position is
associated with the current position
of the original thumb

(b) Moving the pointer to the left or
right initiates backward or forward
browsing, respectively

(c) Mapping actual to virtual slider
scale results in a non-linear
resolution of the virtual scale
thumb) or very slow (by moving the mouse pointer
closer to the thumb), even on the level of single 
frames. However, we also identified some problems 
with this approach. Some users complained that they 
have to focus on two things when browsing the 
video: the actual content as well as the slider widget. 
While this is also true with a regular slider, it is more 
critical in the case of an elastic interface because the 
scrolling speed depends directly on the pointer’s and
the thumb’s movements. As a consequence, acci- 
dental changes of the scrolling direction happen 
quite frequently when a user tries to reduce the 
scrolling speed. In such a case, both the thumb and 
the pointer are moving toward each other. Since a 
user is more likely to look at the content of the 
document in such a situation rather than at the slider 
thumb, it can happen quite easily that the pointer 
accidentally moves behind the thumb, thus resulting 
in an unwanted change of the scrolling direction. 

These observations were the main reason for us 
to introduce a modification of the concept of elastic 
interfaces called elastic panning. With elastic panning,
browsing gets activated by clicking anywhere
on the window that shows the content of the document.
As a consequence, an icon appears on the
screen that represents a slider thumb and is associated
with the current position within the file (compare
Figure 4a). Moving the pointer to the right or left
enables forward or backward browsing, respectively,
along a virtual scale that extends to both sides
of the virtual slider thumb (compare Figure 4b). The 
resulting movements of the thumb, and thus the 
scrolling speed, are similar to the elastic slider 
illustrated in Figure 3: Scrolling is slow if the thumb
and pointer are close to each other, and the speed 
increases with the distance between those two 
objects. Distance is only measured horizontally along 
the virtual slider scale. Vertical movements of the 
mouse pointer do not influence scrolling behavior, 
but only make the visualization of the virtual thumb 
and scroll bar follow the pointer’s movements (i.e., 
the widgets on the screen are “glued” to the pointer). 
The beginning and the end of the virtual scale are 
mapped to the borders of the player window. This, 
together with the association of the initial clicking 
position with the actual position in the document at 
that time, can result in a mismatch of the scales to 
both sides of the slider thumb (compare Figure 4c). 
However, since the original motivation of elastic 
interfaces was to make scrolling independent of the 
actual length of a document (and thus of the slider’s 
scale), this mismatch does not influence the overall 
scrolling behavior. It is therefore uncritical as we 
also confirmed in the evaluation presented in Hurst, 
Figure 5. Snapshots of a video player featuring an elastic slider (left) and elastic panning (right),
respectively
Gotz, et al. (2004). However, it was observed that
some users were irritated by the change of the scale, 
which is why we revised the interface design by 
showing only the part of the virtual slider bar facing 
the actual scrolling direction and by not representing 
the actual values on the scale but only relative 
positions (compare Figure 5, which shows a snap- 
shot of the actual implementation of the revised 
interface design). The improvement resulting from 
this revision was confirmed by the test persons when 
we confronted them with the revised design. In 
addition, we ran a couple of experiments, for ex- 
ample, with different distance-to-speed mappings 
(compare Figure 3, right side), in order to optimize 
the parameters of the implementation. 
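
The geometry of elastic panning can be sketched as follows. The function names, the quadratic speed mapping, and the example numbers are assumptions for illustration and are not taken from the implementation evaluated in the study.

```python
def virtual_scale_resolution(click_x, window_width, position, doc_length):
    """Document units per pixel to the left and right of the virtual thumb.

    The click position is tied to the current document position, and the
    window borders are tied to the beginning and end of the document.
    """
    if not 0 < click_x < window_width:
        raise ValueError("click must fall strictly inside the window")
    left = position / click_x
    right = (doc_length - position) / (window_width - click_x)
    return left, right


def panning_speed(pointer, thumb, gain=0.02):
    """Signed scroll speed from the horizontal distance only; y is ignored.
    Hypothetical quadratic mapping."""
    dx = pointer[0] - thumb[0]
    speed = gain * dx * dx
    return speed if dx >= 0 else -speed


# Hypothetical example: 800-pixel window, 90,000 frames, click at x=200 on frame 60,000.
left, right = virtual_scale_resolution(200, 800, 60_000, 90_000)
forward = panning_speed(pointer=(300, 500), thumb=(200, 100))
backward = panning_speed(pointer=(100, 100), thumb=(200, 100))
```

In this example the virtual scale covers 300 frames per pixel on the left and 50 frames per pixel on the right, which is the mismatch shown in Figure 4c. Because the sketched speed depends only on the horizontal pointer-to-thumb distance, the mismatch has no effect on scrolling behavior.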

In Hurst, Gotz, et al. (2004), we present a quali- 
tative usability study in which 10 participants had to 
perform a search task with elastic panning as well as 
with a standard slider interface. This comparative 
study showed the feasibility of the proposed ap- 
proach as well as its usefulness for video browsing. 
It circumvents the problem of accidental direction 
changes that frequently appears with the interface 
illustrated in the first snapshot in Figure 5. Generally,
elastic panning was considered by the participants 
as intuitive as well as easy to learn and operate. In 
addition to it solving the scaling problem, most users 
appreciated that the movements were less jerky with 
elastic panning than with the original slider. Some 
noted that they particularly liked the ability to quickly 
navigate the file and to be able to easily slow down 
once they get closer to a target position. Another 
feature that was highly appreciated is the ability to 
click directly on the window instead of having to use 



a separate slider widget for browsing. Users liked 
that they did not have to continuously change be- 
tween looking at the video and the slider widget, but 
could focus on the player window all the time. This 
seems to be a big advantage on small devices as well, 
for which display space is limited and hence videos 
are often played in full-screen mode. 

In addition to video browsing, we also applied 
elastic panning successfully to the task of the visual 
browsing of recorded lectures, that is, visual streams 
that contain recorded slides used in lectures as well 
as handwritten annotations made on the slides (Hurst 
& Gotz, 2004). Here, we are dealing with a mixture 
of static data, that is, the prepared slides that contain 
text, graphics, and still images, and a continuous 
signal, that is, the annotations that are made and 
recorded during the lecture that are replayed by the 
player software in the same temporal order as they 
appeared during the live event. Especially if there 
are a lot of handwritten annotations, for example, if 
a mathematical proof was written down by the 
lecturer, the scaling problem can become quite 
critical. Again, users — in this case students who use 
recorded lectures for recapitulation and exam prepa- 
ration — highly appreciated that with this interface 
they are able to quickly skim through the visual signal 
of the recording as well as slow down easily to 
analyze the content in more detail. Being able to 
scroll and navigate at the smallest possible level 
allows the users, for example, to access every 
position in which some event happened in the visual 
stream, such as a small annotation made in a graphi- 
cal illustration, and to start replay at the correspond- 
ing position in the lecture recording easily. 
FUTURE TRENDS

The fact that traditional sliders and scroll bars do not 
scale to large document sizes is well known and has 
been studied extensively in the past. Different solu- 
tions to solve or circumvent this problem have been 
proposed throughout the ’90s. However, more re- 
cent developments in input and output devices as 
well as an enhanced spectrum of data that we have 
to deal with today make it necessary to refocus on 
this issue. For example, portable devices (such as 
very small laptops) and handheld devices (such as 
PDAs, personal digital assistants) have a restricted
screen resolution, thus amplifying the scaling prob- 
lem. With pen-based input becoming more popular 
due to devices such as the tablet PC (personal 
computer) and PDAs, not all solutions that were 
proposed in the past are applicable anymore. Last 
but not least, new media types, especially continu- 
ous, time-dependent data, call for different interface 
designs. 

The work on elastic interfaces that we summa- 
rized in this article is a first step in supporting these 
issues because it showed that this interaction ap- 
proach can be used not only for static, time-indepen- 
dent data, but for continuous, time-dependent data 
streams as well. In addition, it offers great potential 
for usage in relation with pen-based input devices 
and it is also applicable to very small screen sizes 
such as PDAs, although a final proof of this claim is 
left for future work. However, first experiments 
with elastic panning and pen-based interaction on a 
tablet PC have been very promising. In addition to 
visual data, which is the focus of this article, today 
we are more and more faced with digital audio 
documents as well. In Hurst and Lauer (in press), 
we describe how the concept of elastic interfaces 
can also be applied to flexible and easy speech 
skimming. Detailed descriptions about elastic audio 
skimming as well as first usability feedback can be 
found in Hurst, Lauer, and Gotz (2004a, 2004b). 
Probably the most exciting but also the most difficult 
challenge for future work in this area is to bring these 
two approaches together in one interface, thus en- 
abling real multimodal navigation in both acoustic as 
well as visual data at the same time. 



CONCLUSION 

This article described how elastic interfaces can be 
used for visual data browsing and feature selection. 
First, we reviewed the original work done by Masui 
(1998) and Masui et al. (1995) on elastic interfaces 
in relation to static, time-independent data. Our own 
work includes the successful transition of this con- 
cept from the static, time-independent domain to 
continuous, time-dependent media streams, such as 
video as well as mixed-media streams. In addition, 
we introduced a modification in the interface design 
that proved to lead to a clear improvement in usabil- 
ity, especially in relation to continuous visual media 
streams. Current and future work includes evalua- 
tions of the discussed concepts with different input 
devices, in particular pen-based input, as well as in 
combination with acoustic-data browsing. 
KEY TERMS

Elastic Interfaces: Interfaces or widgets that 
manipulate an object, for example, a slider thumb, 
not by direct interaction but instead by pulling it along 
a straight line that connects the object with the 
current position of the cursor. Movements of the 
object are a function of the length of this connection, 
thus following the rubber-band metaphor. 

Elastic Panning: An approach for navigation in 
visual data, which has proven to be feasible not only 
for the visual browsing of static, time-independent 
data, but for continuous, time-dependent media 
streams as well. Similar to the FineSlider, it builds on
the concept of elastic interfaces and therefore solves 
the scaling problem, which generally appears if a 
long document has to be mapped on a slider scale 
that is limited by window size and screen resolution. 

FineSlider: A special widget introduced by 
Masui et al. (1995) for navigation in static, time- 
independent data. It is based on the concept of 
elastic interfaces and therefore solves the scaling 
problem, which generally appears if a long document 
has to be mapped on a slider scale that is limited by 
window size and screen resolution. 

Random Visible Scrolling: If a media player 
and a data format support real-time random access 
to any position within a respective continuous, time- 
dependent document, such as a video recording, a 
common slider interface can be used to visually 
browse the file’s content in a similar way as a scroll 
bar is used to navigate and browse static informa- 
tion, such as text files. This technique is sometimes 
referred to as random visible scrolling. 

Rubber-Band Metaphor: A metaphor that is 
often used to describe the behavior of two objects 
that are connected by a straight line, the rubber band, 
in which one object is used to pull the other one 
toward a target position. The moving speed of the 
pulled object depends on the length of the line 
between the two objects, that is, the tension on the 
rubber band: Longer distances result in faster move- 
ments, and shorter distances in slower movements. 
Scaling Problem: A term that can be used to
describe the problem of scroll bars and sliders not 
scaling to large document sizes. If a document is 
very long, the smallest unit to move the scroll bar or 
slider thumb on the screen, that is, one pixel, already 
represents a large jump in the file, thus resulting in 
jerky visual feedback that is often considered irritat- 
ing and disturbing, and in the worst case leads to a 
significant loss of information. 



Visual Data Browsing: A term generally used 
to summarize all kinds of interactions involved in 
visually skimming, browsing, and navigating visual 
data in order to quickly consume or identify the 
corresponding content or to localize specific infor- 
mation. Visual data in this context can be a static 
document, such as a text file, graphics, or an image, 
as well as a continuous data stream, such as the 
visual stream of a video recording. 
INTRODUCTION

Recent trends in HCI have sought to widen the 
range of use qualities beyond accessibility and us- 
ability. The impetus for this is fourfold. First, some 
argue that consumer behaviour has become more 
sophisticated and that people expect products to give
them a number of life-style benefits. The benefits 
that products can give people include functional 
benefits (the product does something) and 
suprafunctional benefits (the product expresses 
something). Engagability is thus important in understanding
people’s preferences and relationships with
products. Second, technological advances offer the 
possibility of designing experiences that are like 
those in the real world. Engagability is therefore 
important in providing an evaluative and exploratory 
approach to understanding “real” and “virtual” ex- 
periences. Third, the experiences that people value 
(e.g., sports) require voluntary engagement. Thus, 
engagability is important in designing experiences 
that require discretionary use. Lastly, the product 
life cycle suggests the need to look beyond design to 
engagement. Products change from their initial pro- 
duction, through distribution to consumption. Each 
phase of this life cycle contains decision-making 
activities (e.g., purchasing, design, etc.). Engagability 
is an important research focus in explaining stake- 
holders’ values in making these decisions. As such, 
engagability research seeks to understand the na- 
ture of experience in the real and virtual worlds. The 
activities that people become engaged with are often 
complex and social and thus challenge the traditional 
HCI focus on the single task directed user. Impor- 
tant application areas for this inquiry are learning, 
health and sport, and games. 

BACKGROUND 

Engagability research has primarily come from out- 
side of HCI. It includes research into motivation, 



education, and understanding human experience. 
For example, the feeling of being engaged in expe- 
rience has been investigated by Csikszentmihalyi 
(1991, p. 71), who describes the qualities of optimal 
experience and flow: 

A sense that one’s skills are adequate to cope 
with the challenges at hand, in a goal-directed, 
rule-bound action system that provides clear 
rules as to how well one is performing. 
Concentration is so intense that there is no 
attention left over to think about anything 
irrelevant, or to worry about problems. Self- 
consciousness disappears, and the sense of timing 
becomes distorted. 

Norman’s work with Andrew Ortony and Will- 
iam Revelle (Norman, 2003) proposes that people 
are engaged in compelling experiences at three 
levels of brain mechanism comprising: 

The automatic, prewired layer called the visceral 
level; the part that contains the brain processes 
that control everyday behaviour, known as the 
behavioural level and the contemplative part of the 
brain, or the reflective level. (Norman 2003, p. 6) 

Furthermore, These three components interweave 
both emotions and cognition. (Norman 2003, p. 6) 

Jordan focuses on hedonic use qualities and 
states that, “Games are an example of a product type 
that are designed primarily to promote emotional 
enjoyment through providing people with a cognitive 
and physical challenge.” He goes on to say that, 
“well-designed games can engage players in what 
they are doing. Instead of having the feeling that 
they are sitting in front of the television controlling 
animated sprites via a control pad, they may feel that 
they are playing soccer at Wembley Stadium or 
trying to escape from a monster in some fantasy 
world” (Jordan, 2000, p. 45). 
Dunne’s (1999) “aesthetics of use” and Laurel’s
concept of engagement (from Aristotle) describe a 
similar phenomenon: “Engagement ... is similar in 
many ways to the theatrical notion of the “willing 
suspension of disbelief,” a concept introduced by the 
early nineteenth century critic and poet Samuel 
Taylor Coleridge” (Laurel, 1991, p. 113). 

Engagement in relation to learning is proposed by 
Quinn (1997). He suggests that engagement comes 
from two factors — “interactivity” and 
“embeddedness.” Jones, Valdez, Nowakowski, and 
Rasmussen (1994) describe engaged learning tasks 
as “challenging, authentic, and multidisciplinary. Such 
tasks are typically complex and involve sustained 
amounts of time. . . and are authentic.” Jones, Valdez, 
Nowakowski, and Rasmussen (1995) go on to sug- 
gest six criteria for evaluating educational technol- 
ogy in the context of engaged learning: 

1. Access 

2. Operability 

3. Organisation 

4. Engagability 

5. Ease of use 

6. Functionality 

FUTURE TRENDS 

Engagability was first applied to HCI design by 
Knight and Jefsioutine (2003). The meaning and 
impact of engagability was explored at the 1st International
Design and Engagability Conference in
2004 (Knight & Jefsioutine, 2004). Papers related to 
design practise and the qualities of engaging experi- 
ences. Papers presented examples of engagement 
in the context of: 

1. Community 

2. Creativity 

3. Design 

4. Education 

5. Emotion 

6. Health 

7. Physiology 

8. Real and virtual experience 

9. Identity 

10. Well-being 



CONCLUSION 

Many researchers argue for design to go beyond 
usability and there is a consensus to move to hedonic 
use qualities. The widening of HCI research and 
design into the realms of emotion is to be welcomed 
and engaging products and services offer the prom- 
ise of richer interactions. However, engagement 
also requires an ethical as well as aesthetic approach 
to design. Including human values in design means 
not only better products but also transformative 
qualities as well. 
KEY TERMS

Engagability: A product or service use quality 
that provides beneficial engagement. 



Functional Use Qualities: The quality of a 
product to deliver a beneficial value to the user. 

Hedonic Use Qualities: The quality of a prod- 
uct to deliver pleasurable value to the user. 

Product Life Cycle: The evolution of a product 
from conception onward. 

Suprafunctional Use Qualities: Qualities ex- 
perienced in interacting with a product or service 
that do not have an immediate instrumental value. 
Suprafunctional use qualities include aesthetics
and semantics that influence the user experience 
but are not the primary goal of use. 

Use Quality: The value of the experience of 
interacting with a product or service. 
INTRODUCTION

The goal of HCI research and design has been to 
deliver universal usability. Universal usability is 
making interfaces to technology that everyone can 
access and use. However, this goal has been chal- 
lenged in recent times. Critics of usability (e.g., Eliot, 
2002) have argued that usability “dumbs down” the 
user-experience to the lowest common denomina- 
tor. The critics propose that focusing on ease of use 
can ignore the sophistication of expert users and 
consumers. At the same time, researchers have 
begun to investigate suprafunctional qualities of 
design including pleasure (Jordan, 2000), emotion 
(Norman, 2003), and fun. While recent discussions 
in HCI have brought these questions to the surface,
they relate to deeper philosophical issues about the 
moral implications of design. Molotch (2003, p. 7)
states that:

Decisions about what precisely to make and 
acquire, and when, where, and how to do it 
involve moral judgements about what a man is, 
what a woman is, how a man ought to treat his 
aged parents ... how he himself should grow old,
gracefully or disgracefully, and so on. 

One response to this moral dilemma is to promote 
well-being rather than hedonism as an ethical design 
goal. 

BACKGROUND 

The Western ethical tradition goes back to ancient 
Greece. Ethics develops the concept of good and bad
within five related concepts: 

1. Autonomy

2. Beneficence

3. Justice

4. Non-maleficence

5. Fidelity

At an everyday level, ethics (the philosophy of
morality) informs people's understanding of
the world. The motivation for ethical behaviour goes
beyond the gratification of being a good person. 
Social cohesion is based on a shared understanding 
of good and bad. Bond (1996, p. 229) suggests that 
ethics tries to: “Reconcile the unavoidable separate- 
ness of persons with their inherently social nature 
and circumstances.” 



DESIGN 

Design is the intentional creation of utilitarian ob- 
jects and embodies the values of the maker. Harvey 
Molotch (2003, p. 11) argues that products affect 
people: 

At the most profound level, artefacts do not just 
give off social signification but make meaning of 
any sort possible ... objects work to hold meaning
more or less, less still, solid and accessible to 
others as well as one’s self. 

The moral responsibility of design has led some 
(e.g., William Morris) towards an ethical design 
approach. Ethical design attempts to promote good 
through the creation of products that are made and 
consumed within a socially accepted moral frame- 
work. Victor Papanek (1985, p. 102) has focused on 
the ecological impact of products and has demanded 
a “high social and moral responsibility from the 
designer.” Whiteley (1999, p. 221) describes this 
evolution of ethical design as: “[Stretching] back to 
the mid-nineteenth century and forward to the present. 
However, just what it is that constitutes the ethical 
dimension has changed significantly over 150 years, 
and the focus has shifted from such concerns as the 
virtue of the maker, through the integrity and aesthetics
of the object, to the role of the designer — and
consumer — in a just society.” 

Unlike Morris’s Arts and Crafts approach, engineering
and science-based design is often perceived
as value free. Dunne (1999) quotes Bernard Waites 
to counter this apparent impartiality: “All 
problems... are seen as ‘technical’ problems ca- 
pable of rational solution through the accumulation 
of objective knowledge, in the form of neutral or 
value-free observations and correlations, and the 
application of that knowledge in procedures arrived 
at by trial and error, the value of which is to be judged 
by how well they fulfil their appointed ends. These 
ends are ultimately linked with the maximisation of
society’s productivity and the most economic use of
its resources, so that technology ... becomes ‘instrumental
rationality’ incarnate ....”

HCI 

HCI applies scientific research to the design of user- 
interfaces. While many (e.g., Fogg, 2003) have 
promoted ethics in HCI, Cairns and Thimbleby (2003, 
p. 3) go further to indicate the similarities between 
the two: “HCI is a normative science that aims to 
improve usability. The three conventional normative 
sciences are aesthetics ... ethics ... and logic. Broadly,
HCI’s approaches can be separated into these cat- 
egories: logic corresponds to formal methods in HCI 
and computer science issues; modern approaches, 
such as persuasive interfaces and emotional impact, 
are aesthetics; and the core body of HCI corre- 
sponds with ethics... HCI is about making the user 
experience good.” 

In promoting “good” HCI, ethics has concen- 
trated on professional issues and the impact of 
functionality, ownership, security, democracy, accessibility,
communication, and control. Friedman and Kahn
(2003) summarise this work as pertaining to:

1. Accountability

2. Autonomy 

3. Calmness 

4. Environmental sustainability 

5. Freedom from bias 

6. Human welfare 

7. Identity 



8. Informed consent 

9. Ownership and property 

10. Privacy 

11. Trust 

12. Universal usability 

Guidelines are often used to communicate HCI 
ethics. Fogg (2003, pp. 233-234) provides guidelines 
for evaluating the ethical impact of persuasive com- 
puting. This requires researchers to: 

1. List all stakeholders.

2. List what each stakeholder has to gain. 

3. List what each stakeholder has to lose. 

4. Evaluate which stakeholder has the most to 
gain. 

5. Evaluate which stakeholder has the most to 
lose. 

6. Determine ethics by examining gains and losses 
in terms of values. 

7. Acknowledge the values and assumptions you 
bring to your analysis. 
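
As an illustration only, the seven steps above can be recorded in a small data structure. The sketch below is in Python; the `Stakeholder` class, the `stakeholder_analysis` function, and the numeric weights are hypothetical conveniences introduced here and are not part of Fogg's guidelines. The weights merely make steps 4 and 5 comparable; the ethical judgement in step 6 remains with the analyst.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Stakeholder:
    """One stakeholder (step 1) with what they gain (step 2) and lose (step 3).

    Keys are the values at stake; the integer weights are subjective
    estimates supplied by the analyst (an assumption of this sketch).
    """
    name: str
    gains: Dict[str, int] = field(default_factory=dict)
    losses: Dict[str, int] = field(default_factory=dict)

    def total_gain(self) -> int:
        return sum(self.gains.values())

    def total_loss(self) -> int:
        return sum(self.losses.values())


def stakeholder_analysis(stakeholders: List[Stakeholder],
                         assumptions: List[str]) -> dict:
    """Summarise steps 1-5 and 7; step 6 is left to human judgement."""
    most_to_gain = max(stakeholders, key=Stakeholder.total_gain)  # step 4
    most_to_lose = max(stakeholders, key=Stakeholder.total_loss)  # step 5
    return {
        "stakeholders": [s.name for s in stakeholders],
        "gains": {s.name: s.gains for s in stakeholders},
        "losses": {s.name: s.losses for s in stakeholders},
        "most_to_gain": most_to_gain.name,
        "most_to_lose": most_to_lose.name,
        # Input for step 6: a rough net balance per stakeholder.
        "net_by_stakeholder": {
            s.name: s.total_gain() - s.total_loss() for s in stakeholders
        },
        "assumptions": list(assumptions),  # step 7
    }
```

A call such as `stakeholder_analysis([users, vendor], ["weights are subjective"])` returns a summary that the researcher can then examine in terms of values, as step 6 requires.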

Standards are a more mandatory form of ethical 
guidelines and prescribe processes, quality, and fea- 
tures. Compliance can be informal or through “de 
jure” agreements (e.g., International Organization 
for Standardization, 2000). Cairns and Thimbleby 
(2003, p. 15) offer a less stringent set of HCI ethical
principles comprising:

1. A rule for solving problems

2. A rule for burden of proof 

3. A rule for common good 

4. A rule of urgency 

5. An ecological rule 

6. A rule of reversibility 

Citing Perry (1999) as evidence, Cairns and 
Thimbleby (2003, p. 15) imply that ethical rules are 
a poor substitute for knowledge: 

Students ... generally start from an absolutist
position: ‘There is one right way to do HCI.’ This
initial position matures through uncertainty, 
relativism, and then through stages of personal 
ownership and reflection. At the highest levels... a 
student makes a personal commitment to the 
particular ethical framework they have chosen 
to undertake their work. However, it is a dynamic
activity to develop any such framework in the 
complex and conflicting real world. 

Developing ethical knowledge requires under- 
standing and empathy with users. This becomes 
harder as the design goal shifts from usable user 
interface to engaging designed experience. Patrick 
Jordan’s (2000) “new human factors” describes an 
evolution of consumers where usability has moved 
from being a “satisfier” to a “dissatisfier.” This 
argument implies that it would be unethical to ignore 
users’ deeper wants and needs, which he maintains 
are based on pleasure and emotional benefit. An 
important advance from traditional HCI is Jordan’s 
emphasis on the product life cycle and the changing 
relationships that people have with their possessions. 
The emotions involved in buying a mobile telephone 
are different from those elicited through ownership
and long-term use. The key difference between Jordan’s
approach and traditional HCI is the importance placed
on the benefit of the users’ interaction.

As well as emotional interaction, another recent 
trend in HCI argues that interfaces are not just used 
but are experienced. Shedroff’s (2001, p. 4) model of
experience design offers users an attraction, an 
engagement, and a conclusion. Experience design is 
about the whole experience of an activity. Activities 
occur in time and involve a number of agents, artefacts, 
and situations. Experiences are predicated on motivation
and the reward of the conclusion. Combining
Jordan’s emphasis on the emotional benefit of interaction
and Shedroff’s model of experience, an alternative
set of design qualities is suggested:

1. Attraction

2. Engagement 

3. Benefit 



EMOTIONAL ATTRACTION 

How are users attracted to an experience? In Emotional
Design, Donald Norman (2003, p. 87) states
that “attractiveness is a visceral level phenomenon — 
the response is entirely to the surface look of an 
object.” While showing how emotions are integral to 
cognition and decision-making, the design model he 
proposes (Visceral, Behavioural, and Reflective)
diminishes the social and rational basis of emotion.
Alternatively, Jordan (2000) suggests that attrac- 
tion is based on Lionel Tiger’s (1992) four plea- 
sures: 

1. Socio-pleasure 

2. Psycho-pleasure

3. Physio-pleasure 

4. Ideo-pleasure 

The focus on pleasure raises a number of ethical 
dilemmas. Epicurus suggests that pleasure and dis- 
pleasure need to be measured against their impact 
on well-being. Singer (1994, p. 188) cites Epicurus’ 
treatise on “The Pursuit of Pleasure”: 

Every pleasure then because of its natural 
kinship to us is good, yet not every pleasure is to 
be chosen: even as every pain also is an evil, yet 
not all are always of a nature to be avoided 
...For the good on certain occasions we treat as 
bad, and conversely the bad as good. 

EMOTIONAL ENGAGEMENT 

How are users emotionally engaged in an experience?
Laurel (1991, p. 113) applies the concept of
engagement from Aristotle. Quinn (1997) uses en- 
gagement as a paradigm for learning applications. 
While synthetic experiences that seize the human
intellect, emotions, and senses have the potential for
good, they could also be harmful. Computer games
can be compulsive, but they do not necessarily
benefit the user. Learning can be boring, but it is
often the precursor to the reward of greater knowledge
and experience. Indeed, DeJean (2002, pp.
147-150) suggests that apparently unpleasant expe- 
riences, such as difficulty, challenge, and fatigue, 
can be rewarding in certain circumstances. 

EMOTIONAL BENEFIT 

What profit can users derive from an experience? 
Many researchers argue that users’ wants and 
needs are becoming more sophisticated. Users 
want more from products than value for money and 
usability. Products are not just used; they become 
possessions and weave their way into the fabric of
people’s lives. Bonapace (2002, pp. 187-217)
presents a hierarchical model of users’ expectations
with safety and well-being at ground level, then
functionality and usability, leading up to an apex of
pleasure. Dunne (1999) challenges the goal of us- 
ability, although his critique replaces usability with 
aesthetics, turning products into art. The foreword 
(1999, p. 7) to “Hertzian Tales: Electronic Products, 
Aesthetic Experiences and Critical Design” pro- 
poses that: 

The most difficult challenge for designers of 
electronic objects now lies not in technical and 
semiotic functionality, where optimal levels of 
performance are already attainable, but in the 
realms of metaphysics, poetry and aesthetics, 
where little research has been carried out. 

Aesthetic use-values may be of limited profit for 
people. Owning a beautiful painting can provide 
pleasure for the individual but may be of little benefit 
to them or society. In contrast, the goal of ethics, 
individual and social well-being, is universally beneficial.
Bond (1996, p. 209) promotes self-interest as
an intrinsically human goal: “Morality, we have 
said ... is a value for all humanity, because it profits
us, because it is a contribution, a necessary contribu- 
tion, to our thriving, flourishing, happiness, or well- 
being or eudaimonia.” 



FUTURE TRENDS 

There is a discernible trend in HCI away from the 
more “functional” aspects of interaction to the 
“suprafunctional”. As well as emotion and pleasure, 
there is a growing concern for traditional design 
issues of aesthetics. Ethics has a role to play in 
advancing inquiry into these areas but more impor- 
tantly provides an alternative perspective on the 
benefit and quality of the user experience. 

In focusing on well-being, Bond distinguishes
between short-term and long-term hedonism. Fun is 
balanced against long-term well-being that often 
involves challenges including learning about the world 
and the self. This is echoed in the work of Ellis
(1994), whose rational emotive therapy (RET) focuses
on self-understanding as a prerequisite to
well-being. Sullivan (2000, pp. 2-4) suggests that
well-being is based on three criteria: Antonovsky’s 
Sense Of Coherence (Antonovsky, 1993), Self- 
Esteem, and Emotional Stability (Hills & Argyle, 
2001). Alternatively, Hayakawa (1968, pp. 51-69) 
suggests personal qualities of fully-functioning hu- 
mans: 

1. Nonconformity and individuality

2. Self-awareness 

3. Acceptance of ambiguity and uncertainty 

4. Tolerance 

5. Acceptance of human animality 

6. Commitment and intrinsic enjoyment 

7. Creativity and originality 

8. Social interest and ethical trust 

9. Enlightened self-interest 

10. Self-direction 

11. Flexibility and scientific outlook

12. Unconditional self-acceptance 

13. Risk-taking and experimenting

14. Long-range hedonism

15. Work and practice 

An alternative to designing for hedonic use quali- 
ties is to try to build products that promote well- 
being. This can be shown as a hierarchy of user 
needs (Figure 1) that includes ethical and aspirational
dimensions linked to design goals. Picard (1997) is
one of the few designers and researchers to offer an 
insight into how products might facilitate well-being. 
Her “Affective Mirror” enables users to see how 
others see them and so enables them to change their 
behaviour and perception. 



Figure 1. An ethical framework for HCI design 
and research 
CONCLUSION

The goal of HCI research and design has been to 
deliver universal usability. Critics of usability have 
argued that it “dumbs down” the user-experience to 
the lowest common denominator. In contrast, re- 
searchers have recommended emotional design goals. 
These include pleasure (Jordan, 2000), emotion 
(Norman, 2003), and fun. While these design goals 
match a wider range of human capabilities, they also 
raise ethical issues. Instead, the author suggests
that the ultimate goal of HCI should be to promote 
the benefit of well-being through a value-centred 
design approach. 

REFERENCES 

Antonovsky, A. (1993). The structure and proper- 
ties of the Sense of Coherence Scale. Social Sci- 
ence and Medicine, 36, 725-733. 

Bonapace, L. (2002). Linking product properties to 
pleasure: The sensorial quality assessment method — 
SEQUAM. In W. S. Green & P. W. Jordan (Eds.), 
Pleasure with products: Beyond usability (pp. 
187-217). London: Taylor and Francis. 

Bond, E. J. (1996). Ethics and human well being: 
An introduction to moral philosophy. Cambridge, 
MA: Blackwell. 

Cairns, P., & Thimbleby, H. (2003). The diversity 
and ethics of HCI. Retrieved on April 5, 2004, from
http://www.uclic.ucl.ac.uk/harold/ethics/tochiethics.pdf

DeJean, P. H. (2002). Difficulties and pleasure? In 
W. S. Green & P. W. Jordan (Eds.), Pleasure with 
products: Beyond usability (pp. 147-150). Lon- 
don: Taylor and Francis. 

Dunne, A. (1999). Hertzian tales: Electronic prod- 
ucts, aesthetic experience and critical design. 
London: Royal College of Art. 

Eliot, B. (2002). Hiding behind the user. In Spiked-IT.
Retrieved on January 31, 2002, from
http://www.spiked-online.com/articles/00000002D3DE.htm



Ellis, A. (1994). Rational emotive behavior therapy 
approaches to obsessive-compulsive disorder. Jour- 
nal of Rational-Emotive & Cognitive-Behavior 
Therapy, 12(2), 121-141. 

Epicurus. (1994). The pursuit of pleasure. In P. 
Singer (Ed.), Ethics (p. 188). Oxford, UK: Oxford 
University Press. 

Fogg, B. (2003). Persuasive technology: Using
computers to change what we think and do. San
Francisco: Morgan Kaufmann.

Friedman, B., & Kahn, P. (2003). Human values,
ethics, and design. In J. Jacko & A. Sears (Eds.), 
The HCI handbook: Fundamentals, evolving tech- 
nologies and emerging applications (pp. 1177- 
1201). Mahwah, NJ: Lawrence Erlbaum Associ- 
ates. 

Hayakawa, S. I. (1968). The fully functioning per- 
sonality. In S. I. Hayakawa (Ed.), Symbol, status 
and personality (pp. 51-69). New York: Harcourt 
Brace Jovanovich. 

Hills, P., & Argyle, M. (2001). Emotional stability as 
a major dimension of happiness. Personality and 
Individual Differences, 31, 1357-1364. 

International Organization for Standardization (2000). 
ISO/IEC FDIS 9126-1: Software Engineering — 
Product quality — Part 1: Quality model. ISO. 
Retrieved on October 18, 2005, from
http://www.usability.serco.com/trump/resources/standards.htm#9126-1

Jordan, P. W. (2000). Designing pleasurable prod- 
ucts. London: Taylor and Francis. 

Laurel, B. (1991). Computers as theatre. Boston: 
Addison-Wesley Publishing Company.

Molotch, H. (2003). Where stuff comes from: How 
toasters, toilets, cars, computers and many other 
things come to be as they are. London: Routledge. 

Norman, D. (2003). Emotional design: Why we 
love [or hate] everyday things. New York: Basic 
Books. 

Papanek, V. (1985). Design for the real world. 
London: Thames and Hudson. 






Ethics and HCI 



Perry, W. G. (1999). Forms of ethical and intellectual
development in the college years. San Francisco:
Jossey-Bass.

Picard, R. (1997). Affective computing. Cambridge, 
MA: MIT Press. 

Quinn, C. N. (1997). Engaging learning, instructional
technology forum. Retrieved on April 5, 2004, from
http://itech1.coe.uga.edu/itforum/paper18/paper18.html

Shedroff, N. (2001). Experience design 1. Indianapolis,
IN: New Riders Publishing.

Sullivan, S. (2000). The relations between emotional 
stability, self-esteem, and sense of coherence. In D. 
Brown (Ed.), Journal of Psychology and Behav- 
ioral Sciences. Department of Fairleigh Dickinson 
University at Madison, New Jersey. Retrieved August
18, 2004, from http://view.fdu.edu/default.aspx?id=784

Tiger, L. (1992). The pursuit of pleasure. NJ: 
Transaction Publishers. 

Whiteley, N. (1999). Utility, design principles and the
ethical tradition. In J. Attfield (Ed.), Utility reassessed,
the role of ethics in the practice of design
(p. 221). Manchester, UK: Manchester University
Press. 



KEY TERMS 

Design: The intentional creation of objects. 

Ethical Design: Attempts to promote good 
through the creation of products that are made and 
consumed within a socially-accepted moral frame- 
work. 

Ethics: The philosophy of morality. 

Experience Design: The intentional creation of 
a time-based activity that includes physical objects, 
agents, and situations. 

Universal Usability: Concerns research and 
design activities that enable interfaces to be ac- 
cessed and used by all users. 

Value-Centred Design: An approach to design 
that involves explicating stakeholder (including de- 
signers and developers) values as well as needs. 
Design then aims to communicate and deliver products
and services that meet stakeholders’ values and
needs. 
INTRODUCTION

Technology with its continuing developments pervades
the 21st century world. Consequently, HCI is
becoming an everyday activity for an increasing 
number of people from across the population. Inter- 
actions may involve personal computers (PCs), 
household and domestic appliances, public access 
technologies, personal digital assistants (PDAs), as 
well as more complex technologies found in the 
workplace. Given the increasing use of technology 
by the general public, HCI assumes an ever-growing 
importance. User interactions need to be taken into 
account by designers and engineers; if they fail to do
this, the opportunities presented by the new tech- 
nologies will remain unfulfilled and unrealized. Fur- 
thermore, it is likely that those interactions that take 
place will be marred by frustration and irritation, as 
users fail to achieve the smooth transactions with the 
technology that they expect and desire. One aspect 
of HCI that appears to have been recently over- 
looked is that of expectations. When confronted 
with a new device or when using a familiar one, we 
have expectations about how it will or does work. 
These expectations are part of the interaction pro- 
cess and are important in the sense that they will 
influence our immediate and later use of the technol- 
ogy/device. It is suggested that in recent times we 
have neglected expectations and failed to consider 
them to any great extent in the design process. 

BACKGROUND 

Fifty years ago, expectations were recognized as 
having a role to play in human-machine interactions 
and the design of products. One of the first studies 
was concerned with the design of telephones (Lutz 
& Chapanis, 1955). At this time, Chapanis was 
working at Bell Laboratories, when a project was 
initiated to develop a telephone operated via a push-button
keyset as opposed to a rotary dial. At that
time, there was only one existing push-button model — 
the toll operators’ keyset. The numerical part of this 
keyset comprised two vertical columns of five keys 
each. The first column included 4, 3, 2, 1, 0, while the 
second column held the numerals, 5, 6, 7, 8, 9. 
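
For readers who find a picture easier than a description, the toll operators' layout just described can be sketched as a small data structure. This is a hedged illustration in Python: the names `TOLL_KEYSET_COLUMNS` and `render_keyset` are hypothetical, and the sketch assumes that the order in which the numerals are listed above runs from the top key to the bottom key of each column, which the text does not state explicitly.

```python
# Two vertical columns of five keys each, as described in the text.
# Assumption: each list runs from the top key to the bottom key.
TOLL_KEYSET_COLUMNS = [
    [4, 3, 2, 1, 0],  # first column
    [5, 6, 7, 8, 9],  # second column
]


def render_keyset(columns):
    """Return the keyset as text, one physical row of keys per line."""
    rows = zip(*columns)  # transpose the columns into rows
    return "\n".join(" ".join(str(digit) for digit in row) for row in rows)


print(render_keyset(TOLL_KEYSET_COLUMNS))
```

Under that assumption, consecutive numerals such as 4 and 5 sit side by side while 0 and 9 share the bottom row, an arrangement that does not follow a simple left-to-right, top-to-bottom reading order.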

However, when Chapanis studied the toll operators
at work, he found that a great deal of miskeying was
occurring. Although it was illegal to listen to calls, it 
was possible to do service observing where the 
number requested by the caller could be checked 
against the number dialed. It was found that 13% of 
long distance calls were being incorrectly dialed. In 
essence, the toll operators expected the numbers to 
be elsewhere. This led Chapanis to devise a study on 
the expected locations of numbers on keysets. The 
investigation had three aims: namely, to find out 
where people expected to find numbers and then 
letters on each of six configurations of 10 keys, and 
where they expected to find letters, given certain 
preferred number arrangements. In a questionnaire 
study, 300 participants filled in numbers/letters on 
diagrams of keysets according to the arrangements 
that they felt were the most natural. Analysis of this 
data allowed Lutz and Chapanis (1955) to deduce 
where people expected to find the various alphanu- 
meric characters on a keyset. 

In the 1950s, the discipline of HCI as we know it 
today did not exist; it was not until the launch of the 
first PCs that people began to recognize HCI as a 
distinct entity. Consequently, by the mid-1980s, there 
was a lot of information available on HCI for the 
designers of computer systems and products. One of 
the most comprehensive sources was the 944 guide- 
lines for designing user interface software compiled 
by Smith and Mosier (1986). This 287-page report is 
still available online at: http://www.hcibib.org/sam/. 
Given the detail and breadth of this report, it is 
somewhat surprising that the topic of expectations is 
mentioned rarely; it is only referred to on five 
occasions, which are listed as follows: 
Section on Flowcharts

1. “Flowchart coding within to established conventions
and user expectations.”

2. “... survey prospective users to determine just
what their expectations may be.”

Section on Compatibility with User Expectations

3. “... control entry are compatible with user
expectations.”

4. “User expectations can be discovered by interview,
questionnaire, and/or prototype testing.”

Section on System Load

5. “But load status information may help in any
case by establishing realistic user expectations
for system performance.”

The guidelines suggest that the system should 
conform to established conventions and user expec- 
tations, and the way in which we find out about these 
expectations is through surveys, interviews, ques- 
tionnaires, and prototype testing. However, the fol- 
lowing comment implies that consistency in design is 
all that is needed: 

Where no strong user expectations exist with 
respect to a particular design feature, then 
designers can help establish valid user 
expectations by careful consistency in interface 
design. (Section 3.0/16) 

Given the relative wealth of information available 
on HCI in the 1980s compared to earlier years, it is 
somewhat surprising that expectations have not 
received more emphasis and interest. Lutz and 
Chapanis (1955) were certainly aware of their im- 
portance in their study, but despite the plethora of new 
technological developments since then, this aspect of 
HCI seems generally to have been forgotten. 

DEFINING EXPECTATIONS 

In terms of HCI, expectations relate to how we 
expect a product/system to respond and react when 
we use it. At one level, our expectations are an 
automatic response (e.g., when we perceive visual 
stimuli). For example, when we look at perceptual
illusions, our past experience and knowledge of the
properties of straight lines and circles in our environ- 
ment determines how we perceive the objects. This 
may help to explain how we perceive straight lines as 
bending, circles as moving, and so forth. 

At another level, expectations and beliefs are 
powerful forces in shaping our attitudes and behav- 
ior (e.g., schema and scripts). These are socially 
developed attributes that determine how we behave 
and what is appropriate/inappropriate behavior in a 
particular context. In both of these examples, the 
environment in which we have been nurtured and the 
corresponding culture will have a major influence. 
As an example, straight lines are very much a 
feature of the human-made world and do not exist in 
nature, so perceptual differences would be expected 
between those individuals living in a city and those 
living in the jungle. 

The common feature in both of these examples is 
that expectations are based on past experience and 
knowledge; they also are quite powerful determi- 
nants of how we behave, both on an automatic level 
(i.e., the perceptual processing of information) and 
at a behavioral level. These are important consider- 
ations in HCI, and because of this, they need to be 
taken into account in the design process. 

FUTURE TRENDS 
Population Stereotypes 

One example of the cultural influence on design is 
the population stereotype. These stereotypes are 
everyday artefacts that have strong associations 
(e.g., the way in which color is used). The color red 
suggests danger and is often used to act as a warning 
about a situation. For example, traffic lights when
red warn vehicles of danger and the need to stop at 
the signal. Likewise, warnings on the civil flight deck 
are colored red, while cautions are amber. Often, 
these stereotypes are modeled in nature; for ex- 
ample, berries that are red warn animals not to eat 
them, as their color implies they are poisonous. 

Population stereotypes are determined culturally 
and will differ between countries. They also change 
over time. For example, the current stereotype in
Europe and North America is for baby boys’ clothes 
to be associated with the color blue and baby girls’
clothes to be pink. However, this has not always been 
the case. In America, up until the 1940s, this stereo- 
type was the opposite, as shown by the following two 
quotations: 

[U]se pink for the boy and blue for the girl, if you 
are a follower of convention. (Sunday Sentinel,
March 1914) 

[T]he generally accepted rule is pink for the boy
and blue for the girl. The reason is that pink being 
a more decided and stronger color is more suitable 
for the boy, while blue, which is more delicate and 
dainty, is prettier for the girl. (Ladies’ Home
Journal, June 1918)

Most people would probably agree that blue for a 
boy and pink for a girl is a strong population stereo- 
type in the UK/North America, and today, it would be 
quite unusual for a young male to be dressed in pink. 
Yet, this stereotype has developed only recently. 

Consistency 

A further consideration is that of consistency (i.e., a
match between what people expect and what they 
perceive when actually using the product/system). 
Smith and Mosier (1986) alluded to this in their 
guidelines, although they referred to consistency in 
interface design, which could be interpreted as being 
consistent in terms of controls, functions, move- 
ments, and so forth when using the software (e.g., the 
same command is always used for exit). In design 
terms, consistency also can be interpreted as com- 
patibility (i.e., compatibility between what we expect 
and what we get). One of the classic studies demon- 
strating the importance of consistency/compatibility 
was carried out by Chapanis and Lindenbaum (1959). 
They looked at four arrangements of controls and 
burners as found, for example, on electric/gas stoves. 
They found that the layout, which had the greatest 
spatial compatibility between the controls and the 
burners, was superior in terms of speed and accu- 
racy. This study has been repeated many times (Hsu 
& Peng, 1993; Osborne & Ellingstad, 1987; Payne, 
1995; Ray & Ray, 1979; Shinar & Acton, 1978). 



Past Experience 

The role of past experience and knowledge in 
developing our expectations has already been men- 
tioned; these are viewed as important determinants 
in our attitudes toward technology, as demonstrated 
by the following study. Noyes and Starr (2003) 
carried out a questionnaire survey of British Air- 
ways flight deck crew in order to determine their 
expectations and perceptions of automated warn- 
ing systems. The flight deck crew was divided into 
those who had experience flying with automated 
warning systems (n=607) and those who did not 
have this experience (n=571). It was found that 
automation was more favored by the group who had 
experience flying using warning systems with a 
greater degree of automation. Those without expe- 
rience of automation had more negative expecta- 
tions than those who had experienced flying with 
more automated systems. This suggests that we are 
more negative toward something that we have not 
tried. These findings have implications for HCI, 
training, and the use of new technology; in particu- 
lar, where there already exists a similar situation 
with which the individual is familiar. Pen- and 
speech-based technologies provide examples of 
this; most of the adult population has expectations 
about using these emerging technologies based on 
their experiences and knowledge of the everyday 
activities of writing and talking. 

Noyes, Frankish, and Morgan (1995) surveyed 
people’s expectations of using a pen-based system 
before they had ever used one. They found that 
individuals based their expectations on using pen 
and paper; for example, when asked to make a 
subjective assessment of the sources of errors, 
most people thought these would arise when input- 
ting the lower-case characters. This is understand- 
able, since if we want to write clearly, we tend to do 
this by writing in capitals rather than lower-case. 
However, pen recognition systems do not
work on the same algorithms as people, so there
is no reason why errors should be more likely to occur
with lower-case letters. Indeed, this proved to be
so: when participants were asked about their
perceptions after having used the pen-based system,
they indicated the upper-case letters as being
the primary source of errors.
None of the participants in this work had previously
used a pen-based system; perhaps lack of
familiarity with using the technology was a signifi- 
cant factor in the mismatch between their expecta- 
tions and perceptions. If this was the case, eliciting 
people’s expectations about an activity with which 
they are familiar should not result in a mismatch. 
This aspect recently has been investigated by the 
author in a study looking at people’s expectations/ 
preconceptions of their performance on activities 
with which they are familiar. When asked about 
their anticipated levels of performance when using 
paper and computers, participants expected to do 
better on the computer-based task. After carrying 
out the equivalent task on paper and computer, it was 
found that actual scores were higher on paper; thus, 
people’s task performance did not match their ex- 
pectations. 

Intuitiveness 

An underlying aspect relating to expectations in HCI 
is naturalness (i.e., we expect our interactions with 
computers and technology to be intuitive). In terms 
of compatibility, designs that are consistent with our 
expectations (e.g., burners that spatially map onto 
their controls) could be perceived as being more 
natural/intuitive. Certainly, in the Chapanis and 
Lindenbaum (1959) study, participants performed 
best when using the most compatible design. It is 
reasonable, therefore, to anticipate that intuitive 
designs are more beneficial than those that are not 
intuitive or counter-intuitive, and that HCI should 
strive for naturalness. 

However, there are two primary difficulties associated
with intuitive design. First, with emerging technologies
based on a primary human activity, intuitiveness
may not always be desirable. Take, for
example, the pen-based work discussed previously. 
People’s expectations did not match their percep- 
tions, which was to the detriment of the pen-based 
system (i.e., individuals had higher expectations). 
This was primarily because people, when confronted 
with a technology that emulates a task with which 
they are familiar, base their expectations on this 
experience — in this case, writing. A similar situation 
occurs with automatic speech recognition (ASR) 
interfaces. We base our expectations about using 
ASR on what we know about human-human communications,
and when the recognition system does
not recognize an utterance, we speak more loudly, 
more slowly, and perhaps in a more exaggerated 
way. This creates problems when we use ASR, 
because the machine algorithms are not working in 
the same way as we do when not understood by a 
fellow human. A further irony is that the more 
natural our interaction with the technology is, the 
more likely we are to assume we are not talking to 
a machine and to become less constrained in our 
dialogue (Noyes, 2001). If perfect or near-perfect
recognition were achievable, this would not be a
problem. Given that this is unlikely to be attained in 
the near future, a less natural interaction, where the 
speaker constrains his or her speech, will achieve 
better recognition performance. 

In addition to intuitiveness not always being de- 
sirable, humans are adaptive creatures, and they 
adapt to non-intuitive and counter-intuitive designs. 
Take, for example, the standard keyboard with the 
QWERTY layout. This was designed in the 1860s as 
part of the Victorian typewriter and has been shown 
repeatedly to be a poor design (Noyes, 1998); how- 
ever, it now dominates computer keyboards around 
the world. There have been many keyboards devel- 
oped that fit the shape of the hands and fingertips, 
but these more natural keyboards have been unsuc- 
cessful in challenging the supremacy of QWERTY. 
Likewise, most hob designs still have incompatible 
control-burner linkages, but users adapt to them. We 
also learn to use counter-intuitive controls (e.g., car 
accelerator and brake pedal controls) that have 
similar movements but very dissimilar results. This
ability to adapt to designs that are not intuitive
further weakens the case for naturalness; however,
relying on users to adapt would have considerable human costs,
and it would be unreasonable to support the notion
that poor design is admissible on these grounds.

CONCLUSION 

Expectations are an important aspect of human 
behavior and performance and, in this sense, need to 
be considered in HCI. A mismatch between expec- 
tations and perceptions and/or ignoring people’s 
expectations could lead them to feel frustrated and 
irritated, and to make unnecessary mistakes. Finding 
out about people’s expectations is achieved readily,
as demonstrated by the studies mentioned here, and
will bring benefits not only to the design process but 
also to training individuals to use new technology. 

Expectations are linked intrinsically with natural- 
ness and intuitiveness expressed through consis- 
tency and compatibility in design. However, intu- 
itiveness in design is not necessarily always desir- 
able. This is especially the case when technology is 
being used to carry out familiar everyday tasks (e.g., 
writing and speaking). The situation is further com- 
pounded by the ability of the human to adapt, in 
particular, to counter-intuitive designs.

KEY TERMS

Compatibility: Designs that match our expecta- 
tions in terms of characteristics, function, and opera- 
tion. 

Consistency: Similar to compatibility and some- 
times used interchangeably; designs that match our 
expectations in terms of characteristics, function, 
and operation, and are applied in a constant manner 
within the design itself. 

Expectations: In HCI terms, how we expect 
and anticipate a product/system to respond and 
react when we use it. 

Intuitiveness: Knowing or understanding im- 
mediately how a product/system will work without 
reasoning or being taught. Intuitiveness is linked 
closely to naturalness (i.e., designs that are intuitive 
also will be perceived as being natural). 

Perceptions: In HCI terms, how we perceive a 
product/system to have responded and reacted when 
we have used it. Ideally, there should be a good 
match between expectations and perceptions. 

Perceptual Illusions: Misperceptions of the 
human visual system so that what we apprehend by 
sensation does not correspond with the way things 
really are.

Population Stereotypes: These comprise the
well-ingrained knowledge that we have about the 
world, based on our habits and experiences of living 
in a particular cultural environment. 



Schema and Scripts: Mental structures that 
organize our knowledge of the world around specific 
themes or subjects and guide our behavior so that we 
act appropriately and according to expectation in 
particular situations.

INTRODUCTION

Eye tracking is a technique whereby an individual’s
eye movements are measured so that the researcher 
knows both where a person is looking at any given 
time and the sequence in which the person’s eyes 
are shifting from one location to another. Tracking 
people’s eye movements can help HCI researchers 
to understand visual and display-based information 
processing and the factors that may impact the 
usability of system interfaces. In this way, eye- 
movement recordings can provide an objective source 
of interface-evaluation data that can inform the 
design of improved interfaces. Eye movements also 
can be captured and used as control signals to enable 
people to interact with interfaces directly without 
the need for mouse or keyboard input, which can be 
a major advantage for certain populations of users, 
such as disabled individuals. We begin this article 
with an overview of eye-tracking technology and 
progress toward a detailed discussion of the use of 
eye tracking in HCI and usability research. A key 
element of this discussion is to provide a practical 
guide to inform researchers of the various eye- 
movement measures that can be taken and the way 
in which these metrics can address questions about 
system usability. We conclude by considering the 
future prospects for eye-tracking research in HCI 
and usability testing. 

BACKGROUND 

The History of Eye Tracking 

Many different methods have been used to track eye 
movements since the use of eye-tracking technology 
first was pioneered in reading research more than 
100 years ago (Rayner & Pollatsek, 1989). Electro-oculographic
techniques, for example, relied on electrodes
mounted on the skin around the eye that could
measure differences in electrical potential in order 
to detect eye movements. Other historical methods 
required the wearing of large contact lenses that 
covered the cornea (the clear membrane covering 
the front of the eye) and sclera (the white of the eye 
that is seen from the outside) with a metal coil 
embedded around the edge of the lens; eye move- 
ments then were measured by fluctuations in an 
electromagnetic field when the metal coil moved 
with the eyes (Duchowski, 2003). These methods 
proved quite invasive, and most modern eye-track- 
ing systems now use video images of the eye to 
determine where a person is looking (i.e., their so- 
called point-of-regard). Many distinguishing fea- 
tures of the eye can be used to infer point-of-regard, 
such as corneal reflections (known as Purkinje 
images), the iris-sclera boundary, and the apparent 
pupil shape (Duchowski, 2003). 

How Does an Eye Tracker Work? 

Most commercial eye trackers that are available 
today measure point-of-regard by the corneal-reflection/pupil-center
method (Goldberg & Wichansky,
2003). These kinds of trackers usually consist of a 
standard desktop computer with an infrared camera 
mounted beneath (or next to) a display monitor, with 
image processing software to locate and identify the 
features of the eye used for tracking. In operation, 
infrared light from an LED embedded in the infrared 
camera first is directed into the eye to create strong 
reflections in target eye features to make them 
easier to track (infrared light is used to avoid dazzling 
the user with visible light). The light enters the eye through the pupil,
and a large proportion of it is reflected back from the retina, making
the pupil appear as a bright, well-defined disc (known
as the bright-pupil effect). The corneal reflection (or
first Purkinje image) is also generated by the infrared
light, appearing as a small but sharp glint (see
Figure 1).

Figure 1. Corneal reflection and bright pupil as
seen in the infrared camera image

[Image labels: Bright pupil; Corneal reflection]

Once the image processing software has identi- 
fied the center of the pupil and the location of the 
corneal reflection, the vector between them is mea- 
sured, and, with further trigonometric calculations, 
point-of-regard can be found. Although it is possible 
to determine approximate point-of-regard by the 
corneal reflection alone (as shown in Figure 2), by 
tracking both features, eye movements can, critically,
be dissociated from head movements (Duchowski,
2003; Jacob & Karn, 2003).

Video-based eye trackers need to be fine-tuned 
to the particularities of each person’s eye move- 
ments by a calibration process. This calibration 
works by displaying a dot on the screen; if the
eye fixates for longer than a certain threshold time and
within a certain area, the system records that
pupil-center/corneal-reflection relationship as corresponding
to a specific x,y coordinate on the screen. This is
repeated over a nine- to 13-point grid pattern to gain
an accurate calibration over the whole screen 
(Goldberg & Wichansky, 2003). 
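
The calibration step can be pictured as fitting a mapping from the measured pupil-center/corneal-reflection vector to screen coordinates. The following is a minimal sketch in Python, assuming a simple affine mapping fitted by least squares over the calibration points. This is a simplification chosen for illustration: real trackers may use higher-order or geometric eye models, and the function names (`fit_affine`, `calibrate`, `to_screen`) are hypothetical rather than part of any tracker's software.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def fit_affine(vectors, targets):
    """Least-squares fit of target = a*vx + b*vy + c for one screen axis."""
    rows = [(vx, vy, 1.0) for vx, vy in vectors]
    # Normal equations: (R^T R) coeffs = R^T targets
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve3(ata, atb)


def calibrate(vectors, screen_points):
    """Fit one affine map per screen axis from the calibration grid."""
    coeff_x = fit_affine(vectors, [p[0] for p in screen_points])
    coeff_y = fit_affine(vectors, [p[1] for p in screen_points])
    return coeff_x, coeff_y


def to_screen(vector, calibration):
    """Map a pupil-glint vector to an estimated point-of-regard in pixels."""
    (ax, bx, cx), (ay, by, cy) = calibration
    vx, vy = vector
    return (ax * vx + bx * vy + cx, ay * vx + by * vy + cy)


# Synthetic nine-point grid (assumed screen of 1280 x 1024 pixels).
grid = [(x, y) for y in (100, 512, 924) for x in (100, 640, 1180)]
vecs = [((x - 640) / 40.0, (y - 512) / 45.0) for x, y in grid]
cal = calibrate(vecs, grid)
```

With more calibration points than unknowns, the fit averages out measurement noise in the individual dots, which is one reason a grid is used rather than the minimum number of points.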

Why Study Eye Movements in HCI 
Research? 

What a person is looking at is assumed to indicate the 
thought “on top of the stack” of cognitive processes
(Just & Carpenter, 1976). This eye-mind hypothesis
means that eye movement recordings can provide a 
dynamic trace of where a person’s attention is being 
directed in relation to a visual display. Measuring 
other aspects of eye movements, such as fixations 
(i.e., moments when the eyes are relatively station- 
ary, taking in or encoding information), also can 
reveal the amount of processing being applied to 
objects at the point-of-regard. In practice, the pro- 
cess of inferring useful information from eye-move- 
ment recordings involves the HCI researcher defin- 
ing areas of interest over certain parts of a display or 
interface under evaluation and analyzing the eye 
movements that fall within such areas. In this way, 
the visibility, meaningfulness, and placement of spe- 
cific interface elements can be evaluated objec- 
tively, and the resulting findings can be used to 
improve the design of the interface (Goldberg & 
Kotval, 1999). For example, in a task scenario where 
participants are asked to search for an icon, a 
longer-than-expected gaze on the icon before even- 
tual selection would indicate that it lacks meaning- 
fulness and probably needs to be redesigned. A 
detailed description of eye-tracking metrics and 
their interpretation is provided in the following sec- 
tions. 
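
The area-of-interest analysis described above can be sketched in a few lines of Python. The example assumes rectangular areas of interest and fixations that have already been detected; the class and field names (`Fixation`, `AreaOfInterest`, `aoi_metrics`) are illustrative choices, not part of any particular eye-tracking package.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # screen position in pixels
    y: float
    start_ms: float     # onset time relative to the start of the trial
    duration_ms: float


@dataclass
class AreaOfInterest:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation):
        return (self.left <= fixation.x <= self.right
                and self.top <= fixation.y <= self.bottom)


def aoi_metrics(fixations, aoi):
    """Summarize the fixations that fall inside one area of interest."""
    hits = [f for f in fixations if aoi.contains(f)]
    return {
        "fixation_count": len(hits),
        # Gaze: the sum of all fixation durations within the area.
        "gaze_ms": sum(f.duration_ms for f in hits),
        # None when the area was never fixated.
        "time_to_first_fixation_ms": min((f.start_ms for f in hits), default=None),
    }
```

In the icon-search scenario, a large `gaze_ms` on the icon's area before selection would be the signal that the icon may need redesigning.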



EYE TRACKING AS A RESEARCH 
AND USABILITY-EVALUATION TOOL 

Previous Eye-Tracking Research 

Mainstream psychological research has benefited 
from studying eye movements, as they can provide 
insight into problem solving, reasoning, mental imag- 
ery, and search strategies (Ball et al., 2003; Just & 
Carpenter, 1976; Yoon & Narayanan, 2004; Zelinsky 
& Sheinberg, 1995). Because eye movements provide
a window into so many aspects of cognition,
there also are rich opportunities for the application of
eye-movement analysis as a usability research tool in
HCI and related disciplines such as human factors
and cognitive ergonomics. Although eye-movement
analysis is still very much in its infancy in HCI and
usability research, issues that increasingly are being
studied include the nature and efficacy of information
search strategies with menu-based interfaces
(Altonen et al., 1998; Byrne et al., 1999; Hendrickson,
1989) and the features of Web sites that correlate
with effective usability (Cowen et al., 2002; Goldberg
et al., 2002; Poole et al., 2004). Additionally, eye
trackers have been used more broadly in applied
human factors research to measure situation awareness
in air-traffic-control training (Hauland, 2003),
to evaluate the design of cockpit controls to
reduce pilot error (Hanson, 2004), and to investigate
and improve doctors’ performances in medical procedures
(Law et al., 2004; Mello-Thoms et al.,
2002). The commercial sector also is showing increased
interest in the use of eye-tracking technology
in areas such as market research, for example,
to determine what advert designs attract the greatest
attention (Lohse, 1997) and to determine if
Internet users look at banner advertising on Web
sites (Albert, 2002).

Figure 2. Corneal reflection position changing according to point-of-regard (Redline & Lankford, 2001)

[Image labels: Directed below the camera; Directed at the camera; Directed down and to the right of the camera]






Table 1. Fixation-derived metrics and how they can be interpreted in the context of interface design 
and usability evaluation (references are given to examples of studies that have used each metric) 



Eye-Movement Metric | What it Measures | Reference
Number of fixations overall | More overall fixations indicate less efficient search (perhaps due to sub-optimal layout of the interface). | Goldberg and Kotval (1999)
Fixations per area of interest | More fixations on a particular area indicate that it is more noticeable, or more important, to the viewer than other areas. | Poole et al. (2004)
Fixations per area of interest and adjusted for text length | If areas of interest are comprised of text only, then the mean number of fixations per area of interest can be divided by the mean number of words in the text. This is a useful way to separate out a higher fixation count, simply because there are more words to read, from a higher fixation count because an item is actually more difficult to recognize. | Poole et al. (2004)
Fixation duration | A longer fixation duration indicates difficulty in extracting information, or it means that the object is more engaging in some way. | Just and Carpenter (1976)
Gaze (also referred to as dwell, fixation cluster, and fixation cycle) | Gaze is usually the sum of all fixation durations within a prescribed area. It is best used to compare attention distributed between targets. It also can be used as a measure of anticipation in situation awareness, if longer gazes fall on an area of interest before a possible event occurring. | Mello-Thoms et al. (2004); Hauland (2003)
Fixation spatial density | Fixations concentrated in a small area indicate focused and efficient searching. Evenly spread fixations reflect widespread and inefficient search. | Cowen et al. (2002)
Repeat fixations (also called post-target fixations) | Higher numbers of fixations off target after the target has been fixated indicate that it lacks meaningfulness or visibility. | Goldberg and Kotval (1999)
Time to first fixation on target | Faster times to first fixation on an object or area mean that it has better attention-getting properties. | Byrne et al. (1999)
Percentage of participants fixating on an area of interest | If a low proportion of participants is fixating on an area that is important to the task, it may need to be highlighted or moved. | Albert (2002)
On target (all target fixations) | Fixations on target divided by total number of fixations. A lower ratio indicates lower search efficiency. | Goldberg and Kotval (1999)

Eye-Movement Metrics

The main measurements used in eye-tracking re- 
search are fixations (described previously) and sac- 
cades, which are quick eye movements occurring 
between fixations. There is also a multitude of 
derived metrics that stem from these basic mea- 
sures, including gaze and scanpath measurements. 
Pupil size and blink rate also are studied. 
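
Fixations and saccades are not recorded directly; they have to be derived from the stream of raw gaze samples. The sketch below shows one common way of doing this, a dispersion-threshold approach, which is offered here as an illustration and is not necessarily the method used in the studies cited in this article. The parameter values (a 30-pixel dispersion limit and a 100 ms minimum duration) and the sample format `(t_ms, x, y)` are assumptions.

```python
def dispersion(window):
    """Spread of a group of samples: horizontal range plus vertical range."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion_px=30.0, min_duration_ms=100.0):
    """Group raw (t_ms, x, y) samples into fixations.

    Samples that do not belong to any fixation are treated as saccades or noise.
    """
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Find the first window that spans at least the minimum duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration_ms:
            j += 1
        if j >= n:
            break  # not enough data left for another fixation
        if dispersion(samples[i:j + 1]) <= max_dispersion_px:
            # Extend the window while the samples stay close together.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion_px:
                j += 1
            window = samples[i:j + 1]
            fixations.append({
                "start_ms": window[0][0],
                "duration_ms": window[-1][0] - window[0][0],
                "x": sum(s[1] for s in window) / len(window),
                "y": sum(s[2] for s in window) / len(window),
            })
            i = j + 1
        else:
            i += 1  # slide the window start forward by one sample
    return fixations
```

Changing `min_duration_ms` or `max_dispersion_px` changes how many fixations are reported from the same recording, so these settings should be reported alongside any fixation-based results.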

Fixations 

Fixations can be interpreted quite differently, de- 
pending on the context. In an encoding task (e.g., 
browsing a Web page), higher fixation frequency on 
a particular area can be indicative of greater interest 
in the target (e.g., a photograph in a news report), or 
it can be a sign that the target is complex in some 
way and more difficult to encode (Jacob & Karn, 
2003; Just & Carpenter, 1976). However, these 
interpretations may be reversed in a search task — 
a higher number of single fixations, or clusters of 
fixations, are often an index of greater uncertainty in 
recognizing a target item (Jacob & Karn, 2003). The 
duration of a fixation also is linked to the processing 
time applied to the object being fixated (Just & 
Carpenter, 1976). It is widely accepted that external 
representations associated with long fixations are 
not as meaningful to the user as those associated 
with short fixations (Goldberg & Kotval, 1999). 
Fixation-derived metrics are described in Table 1. 



Saccades 

No encoding takes place during saccades, so they 
cannot tell us anything about the complexity or 
salience of an object in the interface. However, 
regressive saccades (i.e., backtracking eye move- 
ments) can act as a measure of processing difficulty 
during encoding (Rayner & Pollatsek, 1989). Although
most regressive saccades (or regressions)
are very small, only skipping back two or three
letters in reading tasks, much larger phrase-length
regressions can represent confusion in higher-level
processing of the text (Rayner & Pollatsek, 1989).
Regressions equally could be used as a measure of 
recognition value, in that there should be an inverse 
relationship between the number of regressions and 
the salience of the phrase. Saccade-derived metrics 
are described in Table 2. 
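
As a rough illustration of how regressions might be counted, the sketch below classifies backward saccades on a single line of left-to-right text by their size in characters. The character width, the ten-character boundary between small and phrase-length regressions, and the function name are assumptions made for the example; return sweeps to the start of a new line would need separate handling.

```python
def count_regressions(fixation_xs, char_width_px=10.0, large_regression_chars=10):
    """Count small and large backward saccades between successive fixations.

    fixation_xs: horizontal fixation positions, in pixels, in temporal order.
    """
    small = large = 0
    for prev_x, next_x in zip(fixation_xs, fixation_xs[1:]):
        shift_chars = (next_x - prev_x) / char_width_px
        if shift_chars < 0:  # the eyes moved back toward text already read
            if -shift_chars >= large_regression_chars:
                large += 1
            else:
                small += 1
    return {"small": small, "large": large}
```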

Scanpaths 

A scanpath describes a complete saccade-fixate-saccade sequence.
In a search task, an optimal scanpath
is typically viewed as being a straight line to a desired
target with a relatively short fixation duration at the 
actual target (Goldberg & Kotval, 1999). Scanpaths 
can be analyzed quantitatively with the derived 
measures described in Table 3. 
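
Two of the scanpath measures in Table 3 are simple to compute once fixations have been located and assigned to areas of interest. The following sketch assumes fixation positions in pixels and a list of area-of-interest names in viewing order; the function names are hypothetical.

```python
import math
from collections import Counter


def scanpath_length(points):
    """Total distance travelled between successive fixation positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def transition_counts(aoi_sequence):
    """Count moves from one area of interest to a different one."""
    pairs = zip(aoi_sequence, aoi_sequence[1:])
    return Counter((a, b) for a, b in pairs if a != b)
```

A scanpath that moves back and forth between the same two areas produces high counts for both directions of that pair, which is the pattern Table 3 associates with uncertainty.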



Table 2. Saccade-derived metrics and how they can be interpreted in the context of interface design 
and usability evaluation (references are given to examples of studies that have used each metric) 



Eye-Movement Metric | What it Measures | Reference
Number of saccades | More saccades indicate more searching. | Goldberg and Kotval (1999)
Saccade amplitude | Larger saccades indicate more meaningful cues, as attention is drawn from a distance. | Goldberg et al. (2002)
Regressive saccades (i.e., regressions) | Regressions indicate the presence of less meaningful cues. | Sibert et al. (2000)
Saccades revealing marked directional shifts | Any saccade larger than 90 degrees from the saccade that preceded it shows a rapid change in direction. This could mean that the user’s goals have changed or the interface layout does not match the user’s expectations. | Cowen et al. (2002)

Table 3. Scanpath-derived metrics and how they can be interpreted in the context of interface design
and usability evaluation (references are given to examples of studies that used each metric) 






Eye-Movement Metric | What it Measures | Reference
Scanpath duration | A longer-lasting scanpath indicates less efficient scanning. | Goldberg and Kotval (1999)
Scanpath length | A longer scanpath indicates less efficient searching (perhaps due to a suboptimal layout). | Goldberg et al. (2002)
Spatial density | Smaller spatial density indicates more direct search. | Goldberg and Kotval (1999)
Transition matrix | The transition matrix reveals search order in terms of transitions from one area to another. Scanpaths with an identical spatial density and convex hull area can have completely different transition values: one is efficient and direct, while the other goes back and forth between areas, indicating uncertainty. | Goldberg and Kotval (1999); Hendrickson (1989)
Scanpath regularity | Once cyclic scanning behavior is defined, deviation from a normal scanpath can indicate search problems due to lack of user training or bad interface layout. | Goldberg and Kotval (1999)
Spatial coverage calculated with convex hull area | Scanpath length plus convex hull area define scanning in a localized or larger area. | Goldberg and Kotval (1999)
Scanpath direction | This can determine a participant’s search strategy with menus, lists, and other interface elements (e.g., top-down vs. bottom-up scanpaths). Sweep denotes a scanpath progressing in the same direction. | Altonen et al. (1998)
Saccade/fixation ratio | This compares time spent searching (saccades) to time spent processing (fixating). A higher ratio indicates more searching or less processing. | Goldberg and Kotval (1999)



Blink Rate and Pupil Size 

Blink rate and pupil size can be used as an index of 
cognitive workload. A lower blink rate is assumed to 
indicate a higher workload, and a higher blink rate 
may indicate fatigue (Brookings, Wilson, & Swain, 
1996; Bruneau, Sasse & McCarthy, 2002). Larger 
pupils also may indicate more cognitive effort 
(Pomplun & Sunkara, 2003). However, pupil size 
and blink rate can be determined by many other 
factors (e.g., ambient light levels), so they are open 
to contamination (Goldberg & Wichansky, 2003). 
For these reasons, pupil size and blink rate are used 
less often in eye tracking research. 



Technical Issues in 
Eye-Tracking Research 

Experimenters looking to conduct their own eye- 
tracking research should bear in mind the limits of 
the technology and how these limits impact the data 
that they will want to collect. For example, if they
are interested in analyzing fixations, they should
ensure that the equipment is optimized to detect
fixations (Karn et al., 2000). The minimum time for
a fixation is also highly significant. Interpretations of 
cognitive processing can vary dramatically accord- 
ing to the time set to detect a fixation in the eye- 
tracking system. Researchers are advised to set the 
lower threshold to at least 100 ms (Inhoff & Radach,
1998).

Researchers have to work with limits of accuracy
and resolution. A sampling rate of 60 Hz is good
enough for usability studies but inadequate for reading
research, which requires sampling rates of around
500 Hz or more (Rayner & Pollatsek, 1989). It is also
imperative to define areas of interest that are large 
enough to capture all relevant eye movements. Even 
the best eye trackers available are only accurate to 
within one degree of actual point-of-regard (Byrne 
et al., 1999). Attention also can be directed up to one 
degree away from measured point-of-regard with- 
out moving the eyes (Jacob & Karn, 2003). 
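
To judge whether an area of interest is large enough, the one-degree accuracy limit can be converted into screen pixels. A target that subtends a visual angle of θ at viewing distance d has an on-screen size of 2·d·tan(θ/2). The sketch below applies this formula; the viewing distance, the pixel density, and the idea of padding each area of interest by the resulting margin are assumptions made for illustration rather than recommendations from the cited studies.

```python
import math


def degrees_to_pixels(angle_deg, viewing_distance_cm, pixels_per_cm):
    """Convert a visual angle to an on-screen size in pixels."""
    size_cm = 2.0 * viewing_distance_cm * math.tan(math.radians(angle_deg) / 2.0)
    return size_cm * pixels_per_cm


def pad_area(left, top, right, bottom, margin_px):
    """Grow a rectangular area of interest by a margin on every side."""
    return (left - margin_px, top - margin_px, right + margin_px, bottom + margin_px)


# Assumed setup: participant seated 60 cm from a display with 37.8 pixels per cm.
margin = degrees_to_pixels(1.0, 60.0, 37.8)
padded_icon = pad_area(200, 150, 264, 214, margin)
```

Under these assumed viewing conditions, one degree corresponds to roughly 40 pixels, so an area of interest drawn tightly around a small icon could miss fixations that were in fact directed at it.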

Eye trackers are quite sensitive instruments and 
can have difficulty tracking participants who have 
eyewear that interrupts the normal path of a reflec- 
tion, such as hard contact lenses, bifocal and trifocal 
glasses, and glasses with super-condensed lenses. 
There also may be problems tracking a person with 
very large pupils or a lazy eye such that the person’s
eyelid obscures part of the pupil and makes it 
difficult to identify. Once a person is calibrated 
successfully, the calibration procedure then should 
be repeated at regular intervals during a test session 
in order to maintain an accurate point-of-regard 
measurement. 

There are large differences in eye movements 
between participants on identical tasks, so it is 
prudent to use a within-participants design in order 
to make valid performance comparisons (Goldberg 
& Wichansky, 2003). Participants also should have 
well-defined tasks to carry out (Just & Carpenter, 
1976) so that their eye movements can be attributed 
properly to actual cognitive processing. Visual dis- 
tractions (e.g., colorful or moving objects around the 
screen or in the testing environment) also should be 
eliminated, as these inevitably will contaminate the 
eye-movement data (Goldberg & Wichansky, 2003). 
Finally, eye tracking generates huge amounts of 
data, so it is essential to perform filtering and analy- 
sis automatically, not only to save time but also to 
minimize the chances of introducing errors through 
manual data processing. 

EYE TRACKING AS AN 
INPUT DEVICE 

Eye movements can be measured and used to enable 
an individual actually to interact with an interface.
Users could position a cursor by simply looking at
where they want it to go or click an icon by gazing at 
it for a certain amount of time or by blinking. The first 
obvious application of this capability is for disabled 
users who cannot make use of their hands to control 
a mouse or keyboard (Jacob & Karn, 2003). How- 
ever, intention often can be hard to interpret; many 
eye movements are involuntary, leading to the so-called
Midas Touch problem (see Jacob & Karn, 2003), in that you
cannot look at anything without immediately activat- 
ing some part of the interface. One solution to this 
problem is to use eye movements in combination 
with other input devices to make intentions clear. 
Speech commands can add extra context to users’ 
intentions when eye movements may be vague, and 
vice versa (Kaur et al., 2003). 
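
Selection by gazing at an icon "for a certain amount of time" can be sketched as a dwell timer that resets whenever the gaze leaves the target, which is one simple way of reducing unintended activations. The 500 ms dwell threshold, the rectangular target, and the function name are assumptions for the example.

```python
def dwell_select(samples, target, dwell_ms=500.0):
    """Return the time at which the target is selected, or None.

    samples: iterable of (t_ms, x, y) gaze samples in temporal order.
    target: (left, top, right, bottom) rectangle in screen pixels.
    """
    left, top, right, bottom = target
    entered_at = None
    for t, x, y in samples:
        inside = left <= x <= right and top <= y <= bottom
        if not inside:
            entered_at = None   # gaze left the target: reset the dwell timer
        elif entered_at is None:
            entered_at = t      # gaze has just arrived on the target
        elif t - entered_at >= dwell_ms:
            return t            # gaze has rested long enough to count as a click
    return None
```

A longer dwell threshold makes accidental selections less likely but slows down deliberate ones, so the value is a trade-off that would need tuning for each user group.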

Virtual reality environments also can be con- 
trolled by the use of eye movements. The large 
three-dimensional spaces in which users operate 
often contain faraway objects that have to be ma- 
nipulated. Eye movements seem to be the ideal tool 
in such a context, as moving the eyes to span long 
distances requires little effort compared with other 
control methods (Jacob & Karn, 2003). Eye-movement
interaction also can be used in a subtler way
(e.g., to trigger context-sensitive help as soon as a
user becomes confused, as indicated by too many
regressions while reading text [Sibert
et al., 2000]). Other researchers (Ramloll et al., 2004)
have used gaze-based interaction to help autistic 
children learn social skills by rewarding them when 
they maintain eye contact while communicating. 

Some techniques alter a display, depending on 
the point-of-regard. Some large-display systems, 
such as flight simulators (Levoy & Whitaker, 1990; 
Tong & Fisher, 1984), channel image-processing 
resources to display higher-quality or higher-resolu- 
tion images only within the range of highest visual 
acuity (i.e., the fovea) and decrease image process- 
ing in the visual range where detail cannot be 
resolved (the parafovea). Other systems (Triesch, 
Sullivan, Hayhoe & Ballard, 2002) take advantage of 
the visual suppression during saccades to update 
graphical displays without the user noticing. Yet 
another rather novel use is tracking the point-of- 
regard during videoconferencing and warping the 
image of the eyes so that they maintain eye contact 
with other participants in the meeting (Jerald & 
Daily, 2002).

FUTURE TRENDS IN EYE TRACKING

Future developments in eye tracking should center 
on standardizing what eye-movement metrics are 
used, how they are referred to, and how they should 
be interpreted in the context of interface design 
(Cowen et al., 2002). For example, no standard 
exists yet for the minimum duration of a fixation 
(Inhoff & Radach, 1998), yet small differences in 
duration thresholds can make it hard to compare 
studies on an even footing (Goldberg & Wichansky, 
2003). Eye-tracking technology also needs to be 
improved to increase the validity and reliability of the 
recorded data. The robustness and accuracy of data 
capture need to be increased so that point-of-regard 
measurement stays accurate without the need for 
frequent recalibration. Data-collection, data-filter- 
ing, and data-analysis software should be stream- 
lined, so that they can work together without user 
intervention. The intrusiveness of equipment should 
be decreased to make users feel more comfortable, 
perhaps through the development of smaller and 
lighter head-mounted trackers. Finally, eye-tracking 
systems need to become cheaper in order to make 
them a viable usability tool for smaller commercial 
agencies and research labs (Jacob & Karn, 2003). 
Once eye tracking achieves these improvements in 
technology, methodology, and cost, it can take its 
place as part of a standard HCI toolkit. 

CONCLUSION 

Our contention is that eye-movement tracking rep- 
resents an important, objective technique that can 
afford useful advantages for the in-depth analysis of 
interface usability. Eye-tracking studies in HCI are 
beginning to burgeon, and the technique seems set to 
become an established addition to the current battery
of usability-testing methods employed by commercial
and academic HCI researchers. This growth
in the use of the method in HCI studies
looks likely to continue as the technology becomes
increasingly affordable, less invasive, and easier
to use. The future seems rich for eye tracking and 
HCI.

KEY TERMS

Area of Interest: An area of interest is an 
analysis method used in eye tracking. Researchers 
define areas of interest over certain parts of a 
display or interface under evaluation and analyze 
only the eye movements that fall within such areas. 

Eye Tracker: Device used to determine point- 
of-regard and to measure eye movements such as 
fixations, saccades, and regressions. Works by track- 
ing the position of various distinguishing features of 
the eye, such as reflections of infrared light off the 
cornea, the boundary between the iris and sclera, or 
apparent pupil shape. 

Eye Tracking: A technique whereby an
individual’s eye movements are measured so that
the researcher knows where a person is looking at
any given time and how that person’s eyes are
moving from one location to another.

Eye-Mind Hypothesis: The principle at the 
origin of most eye-tracking research. Assumes that 
what a person is looking at indicates what the person 
currently is thinking about or attending to. Recording 
eye movements, therefore, can provide a dynamic 
trace of where a person’s attention is being directed 
in relation to a visual display such as a system 
interface. 

Fixation: The moment when the eyes are rela- 
tively stationary, taking in or encoding information. 
Fixations last for 218 milliseconds on average, with 
a typical range of 66 to 416 milliseconds. 

Gaze: An eye-tracking metric, usually the sum of 
all fixation durations within a prescribed area. Also 
called dwell, fixation cluster, or fixation cycle. 

Point-of-Regard: Point in space where a per- 
son is looking. Usually used in eye-tracking research 
to reveal where visual attention is directed. 

Regression: A regressive saccade. A saccade 
that moves back in the direction of text that has 
already been read. 

Saccade: An eye movement occurring between 
fixations, typically lasting for 20 to 35 milliseconds. 
The purpose of most saccades is to move the eyes to 
the next viewing position. Visual processing is auto- 
matically suppressed during saccades to avoid blur- 
ring of the visual image. 

Scanpath: An eye-tracking metric, usually a 
complete sequence of fixations and interconnecting 
saccades. 
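
As a hedged illustration of how the Area of Interest, Fixation, and Gaze terms fit together, the following Python sketch sums fixation durations inside a rectangular area of interest. The class names, the pixel coordinates, the sample data, and the `min_duration_ms` thresholds are assumptions made for this example; they are not values prescribed by any eye-tracking standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    x: float            # horizontal point-of-regard (pixels, assumed unit)
    y: float            # vertical point-of-regard (pixels, assumed unit)
    duration_ms: float  # how long the eyes stayed relatively stationary


@dataclass(frozen=True)
class AreaOfInterest:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, f: Fixation) -> bool:
        return self.left <= f.x <= self.right and self.top <= f.y <= self.bottom


def gaze_duration(fixations, aoi, min_duration_ms=100.0):
    """Sum the durations of fixations that fall inside the AOI and
    last at least min_duration_ms (a hypothetical threshold)."""
    return sum(
        f.duration_ms
        for f in fixations
        if f.duration_ms >= min_duration_ms and aoi.contains(f)
    )


# Hypothetical recording: three fixations on a menu, one elsewhere.
fixations = [
    Fixation(120, 80, 220),
    Fixation(130, 90, 90),
    Fixation(400, 300, 310),
    Fixation(140, 85, 180),
]
menu = AreaOfInterest(left=100, top=50, right=200, bottom=120)

gaze_100 = gaze_duration(fixations, menu, min_duration_ms=100.0)
gaze_66 = gaze_duration(fixations, menu, min_duration_ms=66.0)
```

With these sample values the two thresholds give different gaze totals for the same recording, which is one way to see why differing fixation-duration thresholds make studies hard to compare.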

INTRODUCTION

The aim of this article is to describe a method that 
helps analysts to translate qualitative data gathered 
in the field, collected for the purpose of requirements 
specification, to a model usable for software engi- 
neers. 

Requirements specification comprises three
parts: functional requirements, quality
requirements, and nonfunctional requirements. The
first one specifies how the software system should 
function, who are the actors, and what are the input 
and output of the functions. The second one speci- 
fies what quality requirements the software should 
meet while operating in context of its environment 
such as reliability, usability, efficiency, portability, 
and maintainability. Finally, the third part specifies 
other requirements including context of use and 
development constraints. Examples of context of 
use are where and when the system is used, and 
examples of development constraints are human 
resources, cost and time constraints, technological 
platforms, and development methods. The role of the 
requirements specification is to give software engi- 
neers a basis for software design, and, later in the 
software development life cycle, to validate the 
software system. Requirements specification can 
also serve the purpose of validating users’ or cus- 
tomers’ view of the requirements of the system. 

On one hand, there has been a growing trend
towards analyzing the needs and abilities of the user
through participatory design (Kuhn & Muller, 1993),
activity theory (Bertelsen & Bødker, 2003),
contextual design (Beyer & Holtzblatt, 1998),
user-centered design (Gulliksen, Göransson, & Lif, 2001),
and co-design and observation as in ethnography
(Hughes, O’Brien, Rodden, Rouncefield, &
Sommerville, 1995). Common to these methods is
that qualitative data (Taylor & Bogdan, 1998) is 
collected, to understand the future environment of 
the new system, by analyzing the work or the tasks, 
their frequency and criticality, the cognitive abilities 
of the user and the users’ collaborators. The scope 
of the information collected varies depending on the 
problem, and sometimes it is necessary to gather 
data about the regulatory, social, and organizational 
contexts of the problem (Jackson, 1995) to be solved.
The temporal and spatial contexts describe when the 
work should be carried out and where. 

On the other hand, software engineers have 
specified requirements in several different model- 
ling languages that range from semiformal to formal. 
Examples of semiformal languages are UML 
(Larman, 2002), SADT, and IDEF (Ross, 1985). 
Examples of the latter are Z (Potter, Sinclair, & Till, 
1996), VDM or ASM. Those are modelling lan- 
guages for software development, but some lan- 
guages or methods focus on task or work modelling 
such as Concurrent Task Trees (Paterno, 2003) and 
Cognitive Work Analysis (Vicente, 1999). Others
emphasize the specification method more than a
modelling language. Examples of these are
Scenario-Based Design (SBD) (Rosson & Carroll, 
2002) and Contextual Design (Beyer & Holtzblatt, 
1998). In practice, many software developers use 
informal methods to express requirements in text, 
e.g., as narrations or stories. Agile Development 
Methods (Abrahamsson, Warsta, Siponen, & 
Ronkainen, 2003) emphasize this approach and 
thereby are consistent with their aim of de-empha- 
sizing methods, processes, and languages in favor of 
getting things to work. A popular approach to
requirements elicitation is developing prototypes. 

There has been less emphasis on bridging the gap 
between the above two efforts, for example, to deliver 
a method that gives practical guidelines on how to 
produce specifications from qualitative data (Hertzum, 
2003). One reason for this gap can be that people from 
different disciplines work on the two aspects, for 
example, domain analysts or experts in HCI, and 
software engineers who read and use requirements 
specification as a basis for design. Another reason for 
this gap may be in the difference in the methods that 
the two sides have employed, that is, soft methods 
(i.e., informal) for elicitation and analysis and hard 
methods (i.e., formal) for specification. 

In this article, we suggest a method to translate 
qualitative data to requirements specification that 
we have applied in the development of a Smart
Space for Learning. The process borrows ideas
from or uses scenarios, interviews, feature-based 
development, soft systems methodology, claims 
analysis, phenomenology, and UML. 

FOLLOWING A PROCESS 

The proposed process comprises five distinct steps 
or subprocesses, each with defined input and output 
(Table 1). The input to a step is the data or informa- 
tion that is available before the step is carried out, 
and the output is the work product or a deliverable of 
the step. We have also defined who, that is, which
roles, will carry out each step. We will describe the
individual steps of the process in more detail in
subsequent sections.



Table 1. Steps of an analysis process

Step: Elicitation design
Who: Domain analyst
Input:
• Feature ideas from designers
• Marketing information
• CATWOE Root definition
Output:
• Framework for research study in terms of questions and goals
• Preliminary life cycle of artefacts in domain
• Scenarios describing work in the domain
• Scenarios describing work in the domain using new features
• Features of new system
• User selection
• Time and place of data gathering

Step: Data gathering
Who: Domain analyst
Input:
• Access to user and context
• Output from previous step
Output:
• Answers to questions in terms of textual, audio or video information
• Modified scenarios, sequences of work
• Artefacts
• Results of claims analysis

Step: Data analysis
Who: Domain analyst, Requirement analyst
Input:
• Output from previous step
Output:
• Matrices
• Implications as facts, phenomena, relationships
• Conflicts and convergence of phenomena

Step: Model specification
Who: Requirement analyst
Input:
• Output from previous step
Output:
• Entity, Actor, Stimulus, Events, Behavior and Communication model

Step: Validation
Who: Domain analysts, Requirement analyst
Input:
• Output from previous step
• Access to user and customer
Output:
• Revised model

Elicitation Design

The specific goals of field studies are different from 
one project to another. Examples of goals include 
analysing hindrances, facilitators of work, qualities, 
and context. The general goal is to build a software 
system that solves a problem and/or fulfills the actor’s
goals. 

To prepare for visiting the work domain, we 
design a framework that should guide us to elicit 
information on what type of work is carried out, who 
does it, in what context, and the quality of work such 
as frequency, performance, and criticality. Analysts 
should gather as much background information be- 
forehand as possible, for example, from marketing 
studies, manuals, current systems, and so forth. To 
understand the context of the system, a CATWOE 
analysis (Checkland & Scholes, 1999) is conducted. 

Before the field studies, designers may brain- 
storm some possible features of the system. When 
creating a system for a new technological environ- 
ment, designers tend to have some ideas on how it 
can further advance work. The work product of this 
step should be an Elicitation Design in the form of a
handbook that describes overall goals of the contex- 
tual inquiries. In qualitative studies, researchers should 
not predefine hypotheses before the inquiry, but they 
can nonetheless be informed. Instead of visiting the 
work domain empty-handed, we suggest that ana- 
lysts be prepared with descriptions of hypothetical 
scenarios or narrations of work with embedded ques- 
tions on more details or procedures in this particular 
instance. Fictional characters can be used to make 
the scenarios more real. 

Work is determined by some life cycle of artefacts. 
The questions can be formed in the framework of 
such a life cycle. In many domains, a life cycle of 
artefacts is already a best practice, and the goal can 
be to investigate how close the particular instance is 
to the life cycle. To reach this goal, several initial 
questions can be designed, bearing in mind that more 
can be added during the field study. Another way to 
reach this goal is through observation of work with 
intermittent questions. 

A part of the preparation is the selection of clients 
and planning of the site visits. Analysts should recog- 
nize that users could be a limited resource. 



Data Gathering 

On site, the researcher uses the Elicitation Design 
framework to inquire about the work domain. The 
first part of the interview gathers information through 
questions about the current work domain with the 
help of hypothetical scenarios. In the second part of 
data gathering, claims analysis (Rosson & Carroll, 
2002) is performed on suggested new features of a 
system. In claims analysis, people are asked to give 
positive and negative consequences to users of 
particular features. For example: Apples are
sweet and delicious (positive), but they come in so
many types that it is difficult to choose among them
(negative).

It is useful to follow best practice in interviewing:
ask the interviewee for concrete examples and
for evidence of work or artefacts. The interviewer
should act as an apprentice, show interest in
the work, and be objective. Data is gathered by
writing down answers and recording audio. A team
of two is all but necessary, with one person
interviewing and the other taking notes.

Data Analysis 

As soon as the site visit is over, the analyst should
review the data gathered, write down any additional
facts or observations that he/she can remember,
and transcribe the audio recordings to text. After this
preparation, the actual analysis takes place, although
analysis can start as early as the data-gathering
phase. With large amounts of data, as is usual in
qualitative studies, there is a need to structure it to
see categorizations of phenomena or facts. Data is
initially coded to derive classes of instances. There
are several ways of doing this, but we propose to
use one table for each goal or question, where the
rows are individual instances of facts observed from
a domain client and the columns are answers. The
data is coded by creating a column for each distinct
answer to the question. The columns can
subsequently be categorized, thus building a hierarchy of
codes.
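
The coding table described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tooling used in the study: the question, the company names, the answer codes, and the category names are all hypothetical, and each row is simplified to one domain client rather than one observed instance.

```python
from collections import defaultdict

# One table per goal or question (hypothetical question).
question = "How are training needs identified?"

# Hypothetical coded observations: (domain client, answer code).
observations = [
    ("Company A", "annual appraisal"),
    ("Company A", "manager request"),
    ("Company B", "annual appraisal"),
    ("Company C", "skills database"),
]


def build_coding_table(observations):
    """Return (codes, table); table[client][code] is True when that
    answer code was observed for that client."""
    codes = sorted({code for _, code in observations})
    table = defaultdict(lambda: dict.fromkeys(codes, False))
    for client, code in observations:
        table[client][code] = True
    return codes, dict(table)


# Second-level coding: group the columns into categories,
# which yields a simple two-level hierarchy of codes.
categories = {
    "annual appraisal": "formal process",
    "skills database": "formal process",
    "manager request": "ad hoc",
}


def category_counts(observations, categories):
    """Count how many distinct clients mention each category."""
    clients_by_category = defaultdict(set)
    for client, code in observations:
        clients_by_category[categories[code]].add(client)
    return {cat: len(clients) for cat, clients in clients_by_category.items()}


codes, table = build_coding_table(observations)
counts = category_counts(observations, categories)
```

Keeping the raw observations separate from the table and the category mapping makes it possible to trace any cell or category count back to the answers it came from.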

The goal of the analysis is to derive an understanding
of phenomena, their characteristics, relationships,
and behavior. We can formulate this knowledge
in propositions and theories. One way of building
theory, attributed to Glaser and Strauss, is called
the grounded theory approach (Taylor & Bogdan,
1998).

Model Specification 

From the analysis, we derive the model specifica- 
tion. It is described with different element types that 
are listed in Table 2. A further step can be taken to 
refine this into a more formal model using, for
example, UML or Z.
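
To make the element types of Table 2 more concrete, the following Python sketch represents them as plain data classes. It is an illustration only: the field names, the `is_consistent` check, and the way the two behaviors are wired together are assumptions made for this example, while the instance names are taken from the examples in Table 2.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: dict = field(default_factory=dict)  # attribute name -> value


@dataclass
class Actor:
    name: str


@dataclass
class Stimulus:
    description: str  # an event or a condition that gets the actor to act


@dataclass
class Behavior:
    name: str
    actor: Actor
    stimulus: Stimulus
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class Communication:
    source: Behavior
    target: Behavior
    payload: str  # the information transferred between the two behaviors

    def is_consistent(self) -> bool:
        """Assumed rule: the payload must be an output of the source
        and an input of the target."""
        return self.payload in self.source.outputs and self.payload in self.target.inputs


knowledge = Entity("Knowledge", {"grade": None})
manager = Actor("Human resource manager")
demand = Stimulus("Company needs to satisfy customer demand")

# Hypothetical wiring of two behaviors via one communication.
analyze_department = Behavior(
    "Analyze-Knowledge(Department)", manager, demand,
    inputs=["Knowledge Grade of Employees"],
    outputs=["Knowledge Grade of Department"],
)
analyze_company = Behavior(
    "Analyze-Knowledge(Company)", manager, demand,
    inputs=["Knowledge Grade of Department"],
)
link = Communication(analyze_department, analyze_company,
                     "Knowledge Grade of Department")
```

A structure of this kind is only a stepping stone; a more formal notation would state such consistency rules explicitly.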

Validation 

The last step of the process is the validation. The 
analyst walks through the model and reviews it in 
cooperation with stakeholders. 

VALIDATION IN A CASE STUDY 

The method has been successfully applied in a case 
study of eliciting and specifying requirements for a 
software system that supports corporations with 
training management. This system is called Smart 
Space for Learning (SS4L). 



This section describes the needs assessment
resulting from interviews with companies; the entire
analysis is outside the scope of this article and will be
described elsewhere. The aim of SS4L is to support
companies in offering their employees effective
learning that serves the needs of organizations,
intelligent support through personalization and
learning profiles, and open interfaces between
heterogeneous learning systems and learning
content.

Seven processes of the learning life cycle were
introduced for the interviews: Training needs,
Training goals, Input controlling, Process controlling,
Output controlling, Transfer controlling, and
Outcome controlling. These processes were built on the
evaluation model of Kirkpatrick (1996) with its
four-level hierarchy. He identifies four levels of training
evaluation: reaction (do they like it?), learning (do
they learn the knowledge?), transfer (does the
knowledge transfer?), and influence (do they use it and
does it make a difference to the business?). The
study was conducted in 18 companies in five
different countries.

The aim of this qualitative requirements study
was threefold. First, we aimed at assessing the
current and future needs of corporate training
management. A second objective was to collect ideas
from companies on what type of IT-support they
foresee to be useful. Finally, a third objective was to
find out positive and negative consequences to users
of the features developed in SS4L.

Table 2. Different types of elements in the model

Element: Entity
Description: An entity is a representation of something abstract or
concrete in the real world. An entity has a set of distinct attributes
that contain values.
Example: Knowledge

Element: Actor
Description: Actor is someone that initiates an action by signalling
an event.
Example: Human resource manager

Element: Stimulus
Description: Stimulus is what gets the Actor to act. This may be an
event or a condition.
Example: Company needs to satisfy customer demand

Element: Input
Description: Input is the information that is needed to complete the
Behavior. The Input can be in the form of output from other Behaviors.
Example: Knowledge Grade of Employees

Element: Output
Description: Output is the result of the Behavior.
Example: Knowledge Grade of Department

Element: Communication
Description: Communication is an abstraction of the activity that
enables transfer of information between two Behaviors. It can contain
input, output data or events.
Example: Communication to Analyze-Knowledge(Department)

Element: Behavior
Description: Behavior is a sequence of actions. It can also be termed
a process.
Example: Analyze-Knowledge(Company)

The main results of the requirements studies are: 

1. Training management life cycle as a qual- 
ity model for learning. We have seen through 
the interviews that measuring the success of a 
course or a seminar implies an overall quality 
model for learning. First, one has to define the 
goals, plan the strategy for implementation, and 
then it can be assessed whether the results 
meet the goals. The interviews showed that 
some companies are still lacking this overall 
quality model. Only some steps of the overall 
process are performed, and therefore the train- 
ing management leads to unsatisfactory re- 
sults. 

2. Functions that would be useful in an IT
system for training management were
analyzed:

• Assessment of current and needed knowl- 
edge. 

• Targeted communication of strategies. 

• Peer evaluation or consulting. 

• Problem driven and on-demand learning. 

• Assessment of transfer during work by 
experienced colleagues. 

• Budget controlling. 

3. Stakeholders demand certain qualities of a 
training management system. During the 
interview and the claims analysis, users also 
demanded other qualities. Confidence and
quality of data have to be ensured when
implementing a system that includes features that
rely on contextual information such as personal 
or corporate profile and recommendation of 
peers. In general, all companies that partici- 
pated in this study revealed a certain concern 
about training management and the need for 
supportive systems that allow for a greater 
transparency. 

The training management life cycle proved to be 
a useful framework for the study. The scenarios 
were useful to set a scope for the interviews and 
they made the discussion easier. Some practice is
required to get interviewees to give evidence through
artefacts and explain answers with examples from 
the domain. As in any qualitative study, the data 
analysis is quite tedious, but coding the interviews 
into tables proved useful and a good basis for 
determining the components of the model as de- 
scribed in Table 2. A good tool for tracking the data, 
that is, from answers to tables to models will give 
researchers confidence that no data is lost and that 
it is correct. Using a combination of interviews and 
claims analysis for proposed features creates inter- 
esting results and may reveal inconsistencies. For 
example, users express the desire to select training 
courses based on the evaluation of peers but are 
reluctant themselves to enter ratings of training 
courses or other personal information because of 
the threat to privacy. The final steps in the study are to
translate the activities to use cases (Larman, 2002) 
and to prioritize them and determine their feasibility. 

FUTURE TRENDS 

There seems to be a growing interest in qualitative
methods that require analysts to gather data in
context. The reason may be that software systems
are more than ever embedded into the environment
and used in everyday life, requiring access for
everybody. Developers will expect to make design
decisions based on empirical studies. This may
call for combined qualitative and quantitative
methods: quantitative methods make it easier
to collect extensive usage information from a larger
population, and a triangulation of methods can then
be used to obtain reliable results.

Although interdisciplinary research has been and
continues to be difficult, there is a growing trend to do
research at the boundaries in order to discover
innovative ideas and to work across disciplines. The
motivation is a growing awareness of the need to examine
the boundaries between software systems, humans
and their cognitive abilities, artificial physical systems,
and biological and other natural systems.

CONCLUSION 

This article has described a method for bridging the 
gap between the needs of the user and requirements 
specification. In a case study, we have elicited
requirements by interviewing companies in various 
countries, with the aid of tools such as scenarios 
linked with questions related to a process life cycle. 
A qualitative analysis has given us insight into cur- 
rent practices and future needs of corporations, 
thereby suggesting innovative features. Interviews 
followed by claims analysis of suggested features 
that indicated positive and negative consequences 
have helped us analyze what features users wel- 
comed and which have been rejected or need to be 
improved. Future work will entail further validation 
of the method for modelling of the functional require- 
ments for developers. 

KEY TERMS

Actor: Someone that initiates an action by sig- 
nalling an event. An actor is outside a system and 
can be either another system or a human being. 

CATWOE: Clients are the stakeholders of the 
systems, Actors are the users, Transformation de- 
scribes the expected transformations the system will 
make on the domain, Worldview is a certain aspect 
of the domain we have chosen to focus on, Owners 
can stop the development of the system, and Envi- 
ronment describes the system context. 

Context: Everything but the explicit input and
output of an application. Context is the state of the
user, state of the physical environment, state of the
computational environment, and history of
user-computer-environment interaction (Lieberman &
Selker, 2000). Put another way, the context of the
system includes Who the actor is, When (e.g., in
time) an actor operates it, Where the actor operates
it, and Why or under what conditions the
actor activates the system or What stimulates the
use (Abowd & Mynatt, 2002).

Contextual Inquiry: Context, partnership, in- 
terpretation, and focus are four principles that guide 
contextual inquiry. The first and most basic
requirement of contextual inquiry is to go to the customer’s
workplace and observe the work. The second is that
the analysts and the customer together, in a
partnership, understand this work. The third is to interpret
work by deriving facts and making a hypothesis that
can have an implication for design. The fourth
principle is that the interviewer defines a point of 
view while studying work (Beyer & Holtzblatt, 
1998). 

Prototype: Built to test some aspects of a
system before its final design and implementation.
During requirements elicitation, a prototype of the
user interface is developed to give stakeholders
ideas about its functionality or interaction.
Prototypes are either high fidelity, that is, built to be very
similar to the product, or low fidelity, that is, built
with very primitive tools, even only pencil and paper.
Prototypes can be throwaway, where they are
discarded, or incremental, where they are developed
into an operational software system.



Qualitative Methodology: “The phrase quali- 
tative methodology refers in the broadest sense to 
research that produces descriptive data - people’s 
own written or spoken words and observable 
behaviour” (Taylor & Bogdan, 1998, p. 7). 

Requirements Elicitation: “Requirements elici- 
tation is the usual name given to activities involved in 
discovering the requirements of the system” (Kotonya 
& Sommerville, 1998, p. 53). 

Requirements Specification: Provides an over- 
view of the software context and capabilities. For- 
mally, the requirements should include: 

• Functional Requirements 

• Data Requirements 

• Quality Requirements 

• Constraints 

Software Quality: A quality model (ISO/IEC 
9126-1, 2001) categorises software quality attributes 
into the following six characteristics that are again 
subdivided into sub-characteristics. The character- 
istics are specified for certain conditions of the 
software product: 

• Functionality: The software product provides 
functions which meet needs. 

• Reliability: The software product maintains 
performance. 

• Usability: The software product can be
understood, learned, and used, and is attractive to
the user.

• Efficiency: The software product provides 
appropriate performance relative to the amount 
of resources used. 

• Maintainability: The software product should 
be modifiable. Modifications include correc- 
tions, improvements, or adaptations. 

• Portability: The software product can be trans- 
ferred from one environment to another. 

System Model: A system model is a description 
of a system. Initially, the model describes what 
problem the system should solve and then it can be 
gradually refined to describe how the system solves 
the problem. Finally, when operational, the system 
can be viewed as a model of some domain behaviour 
and characteristics. 

INTRODUCTION

As the popularity of the Internet has expanded, an 
increasing number of people spend time online. 
More than ever, individuals spend time online read- 
ing news, searching for new technologies, and chat- 
ting with others. Although the Internet was designed 
as a tool for computational calculations, it has now 
become a social environment with computer-medi- 
ated communication (CMC). 

Picard and Healey (1997) demonstrated the po- 
tential and importance of emotion in human-com- 
puter interaction, and Bates (1992) illustrated the 
roles that emotion plays in user interactions with 
synthetic agents. 

Is emotion communication important for human- 
computer interaction? 

Scott and Nass (2002) demonstrated that hu- 
mans extrapolate their interpersonal interaction pat- 
terns onto computers. Humans talk to computers, 
are angry with them, and even make friends with 
them. In our previous research, we demonstrated 
that social norms applied in our daily life are still valid 
for human-computer interaction. Furthermore, we 
proved that providing emotion visualisation in the 
human-computer interface could significantly influ- 
ence the perceived performances and feelings of 
humans. For example, in an online quiz environment, 
human participants answered questions and then a 
software agent judged the answers and presented 
either a positive (happy) or negative (sad)
expression. Even if two participants performed identically
and achieved the same number of correct answers,
the perceived performance of the one in the
positive-expression environment is significantly higher
than that of the one in the negative-expression
environment (Xu, 2005).

Although human emotional processes are much 
more complex than in the above example and it is 
difficult to build a complete computational model, 
various models and applications have been devel- 
oped and applied in human-agent interaction envi- 
ronments such as the OZ project (Bates, 1992), the 
Cathexis model (Velasquez, 1997), and Elliot’s (1992)
affective reasoner. 

We are interested in investigating the influences
of emotions not only for human-agent communication,
but also for online human-human communications.
The first question is, can we detect a human’s
emotional state automatically and intelligently?

Previous work has concluded that emotions
can be detected in various ways: in speech, in
facial expressions, and in text. Examples include
investigations that focus on the synthesis of facial
expressions and acoustic expression, such as Kaiser and
Wehrle (2000), Wehrle, Kaiser, Schmidt, and Scherer
(2000), and Zentner and Scherer (1998). As text
still dominates online communications, we believe
that emotion detection in textual messages is
particularly important.

BACKGROUND 

Approaches for extracting emotion information from
textual messages can be classified into the
categories of keyword tagging, lexical affinity, statistical
methods, or real-world models (Liu, Lieberman, &
Selker, 2003).

We have developed a textual emotion-extraction 
engine that can analyze text sentences typed by 
users. The emotion extraction engine has been pre- 
sented by Xu and Boucouvalas (2002). 

The emotion-extraction engine can analyze
sentences, detect emotional content, and display
appropriate expressions. The intensity and duration of the
expressions are also calculated and displayed in real
time automatically. The first version of our engine
searched only for the first person, I, and the present
tense, so its ability was very limited. In our latest
version, the engine not only applies grammatical
knowledge, but also takes real-world
information and cyberspace knowledge into
account. It is intended to satisfy the demands of
complicated sentence analysis.

The user’s mood is defined as the feelings
perceived from a user’s series of inputs to the
emotion-extraction engine. The current emotion of a user is
based totally on the information assessed within a
single sentence.

A user’s mood may not be consistent with the 
current emotion of the user. For example, a user may 
present a sad feeling in one sentence, but previously 
the user was talking about happy and interesting 
things. The sad feeling presented may not be a
significant emotion and overall the user’s mood may
still be happy.

To calculate the mood of a user, previous emo- 
tions and current emotions need to be analyzed 
together. We assume that emotions are additive and 
cumulative. One way of calculating the mood is to 
average the historic emotions and then find out what 
category the averaged emotion is in. This approach 
is described by Xu (2005). Here, an alternative 
fuzzy-logic approach is presented. 
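
For comparison, the averaging approach mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation described by Xu (2005): it assumes that each sentence yields an intensity between 0 and 1 for each emotion category, and the function name and sample values are hypothetical.

```python
CATEGORIES = ("happiness", "sadness", "surprise", "fear", "disgust", "anger")


def mood_by_averaging(history):
    """history is a list of dicts mapping category -> intensity in [0, 1],
    one dict per analyzed sentence (assumed representation).
    Returns the category with the highest mean intensity, or None."""
    if not history:
        return None
    means = {
        category: sum(emotion.get(category, 0.0) for emotion in history) / len(history)
        for category in CATEGORIES
    }
    return max(means, key=means.get)


# Hypothetical chat history: two happy sentences, then a sad one.
history = [
    {"happiness": 0.8},
    {"happiness": 0.6, "surprise": 0.2},
    {"sadness": 0.7},
]

current_emotion = max(history[-1], key=history[-1].get)
mood = mood_by_averaging(history)
```

In this sample the current emotion of the last sentence is sadness while the averaged mood remains happiness, mirroring the situation described above in which mood and current emotion differ.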

Fuzzy Logic 

Fuzzy logic was developed to deal with concepts that
do not have well-defined, sharp boundaries (Bezdek,
1989), which theoretically is ideal for emotion, as no
sharp boundaries exist between emotion
categories (e.g., happiness, sadness, surprise, fear,
disgust, and anger).



The transition from one physiological state to 
another is a gradual one. These states cannot be 
treated as classical sets, which either wholly include 
a given affect or exclude it. Even within the physi- 
ological response variables, one set merges into 
another and cannot be clearly distinguished from 
another. For instance, consider two affective states: 
a relaxed state and an anxious state. If classical sets 
are used, a person is either relaxed or anxious at a
given instant, but not both. The transition from one
set to another is then abrupt, and such transitions do
not occur in real life.
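
The contrast between classical and fuzzy sets can be illustrated with a short Python sketch. Everything numeric here is an assumption chosen for the example: the 0 to 10 arousal scale, the cutoff of 5, and the linear membership functions are hypothetical and are not taken from the engine described in this article.

```python
def clamp01(value):
    """Restrict a membership degree to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def relaxed(arousal):
    """Fuzzy membership in 'relaxed': 1 up to 3, falling linearly to 0 at 7."""
    return clamp01((7.0 - arousal) / 4.0)


def anxious(arousal):
    """Fuzzy membership in 'anxious': 0 up to 3, rising linearly to 1 at 7."""
    return clamp01((arousal - 3.0) / 4.0)


def crisp_state(arousal, cutoff=5.0):
    """Classical set: exactly one label, with an abrupt switch at the cutoff."""
    return "anxious" if arousal >= cutoff else "relaxed"


# Hypothetical arousal readings on a 0-10 scale.
for arousal in (2.0, 4.0, 5.0, 8.0):
    print(arousal, crisp_state(arousal), relaxed(arousal), anxious(arousal))
```

With the crisp definition a reading just below the cutoff and one at the cutoff receive opposite labels, whereas the fuzzy memberships change gradually and allow a person to be partly relaxed and partly anxious at the same time.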



EMOTION EXTRACTION ENGINE 

The emotion extraction engine is a generic prototype 
based on keyword tagging and real-world knowl- 
edge. Figure 1 depicts an overview of the architec- 
ture of the emotion-extraction engine. 

The sentence analysis component includes three 
components: input analysis, the tagging system, and 
the parser. The input-analysis function splits textual 
messages into arrays of words and carries out initial 
analysis to remove possible errors in the input. The 
tagging system converts the array of words into an 
array of tags. The parser uses rewrite rules and AI 
(artificial intelligence) knowledge to carry out infor- 
mation extraction. The engine classifies emotions 
into the following categories: happiness, sadness, 
surprise, fear, disgust, and anger. For further details, 
please refer to Xu and Boucouvalas (2002) and Xu 
(2005). This article only discusses the fuzzy-logic 
components, which can be seen as an extension to 
the parser. With fuzzy logic methods, the emotion- 
extraction engine can be used to analyze complex 
situations. 

Conflicting Emotion Detection 

The inputs of the conflicting-emotion detection com- 
ponent are the emotion parameters that are passed 
from the sentence analysis component. As mixed 
emotions are a common phenomenon in daily life, it 
is not unusual for a user to type in a sentence, such 
as, “I am happy that I got a promotion, but it is sad 
that my salary is cut,” that contains mixed emotions 
in an online chatting environment. 

Figure 1. The emotion extraction engine overview 




When a sentence contains conflicting emotions, 
judging which emotion represents the overall emo- 
tional feeling is not only based on the current sen- 
tence, but also on the mood. For example, in the 
emotion extraction engine, the mood happy indicates 
that the previous messages an individual typed con- 
tain overwhelmingly more happy feelings than oth- 
ers. When the user types a sentence containing both 
happy and sad emotions, the perceived current mood 
of the user may still be happy instead of happy and 
sad. The reason is that the individual was in a 
predominately happy mood, and the presented sad 
emotion may not be significant enough to change the 
mood from happy to sad. 

The positive emotion category and negative emo- 
tion category are introduced to handle conflicting 
emotions. Positive emotions include happiness and 
surprise, while negative emotions are sadness, fear, 
anger, and disgust. A sentence is treated as a con- 
flicting-emotion sentence only if the sentence con- 
tains both positive and negative emotions. 
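
The conflict test described above can be sketched in a few lines of Python. The function name and the representation of a sentence as a list of category names are assumptions made for this illustration.

```python
POSITIVE = {"happiness", "surprise"}
NEGATIVE = {"sadness", "fear", "anger", "disgust"}


def is_conflicting(emotions):
    """Return True only if the sentence mixes positive and negative categories."""
    found = set(emotions)
    return bool(found & POSITIVE) and bool(found & NEGATIVE)


# "I am happy that I got a promotion, but it is sad that my salary is cut"
print(is_conflicting(["happiness", "sadness"]))  # True
# Two negative emotions do not conflict under this definition.
print(is_conflicting(["sadness", "fear"]))       # False
```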



When a sentence with conflicting emotions is 
found, the emotion-filter component will be called; 
otherwise, the emotion parameters are passed to 
the mood-selection component for further opera- 
tion. 

Mood Selection Component 

The inputs of the mood-selection component are the 
emotion parameters from the sentence emotion 
extraction component and the previous emotions 
stored in the emotion-storage component. 

The aim of the mood selection component is to 
determine the current mood. To achieve this, the 
first step of the mood selection component is to 
convert the emotion parameters into the current 
emotions by filtering the tense information. For 
example, the emotion parameter [happiness] [middle 
intensity] [present tense] is converted to the current 
emotion [happiness] [middle intensity]. The current 
emotion is sent to the storage component as well. 

The previous emotions of the user are stored in the 
storage component. The format is [emotion 
category] [intensity]. To convert the emotion data into 
the format acceptable for a fuzzy system, the following 
fuzzy-data calculations are carried out. 

Fuzzy Data Calculation 

An array E is assigned to contain the accumulative 
intensity values of the six emotion categories. The 
array elements 0 to 5 in turn represent the accumu- 
lative intensity of the emotions happiness, surprise, 
anger, disgust, sadness, and fear. 

The value of each element in array E is calcu- 
lated by adding the five previous intensities of a 
specific emotion category with the current intensity 
of that emotion. Equation 1 is used to calculate the 
accumulative intensity. 

$$ E[x] = \sum_{i=-5}^{0} \alpha_i I_i(x), \quad x = 0, 1, 2, 3, 4, 5 \qquad (1) $$



The values of array E depend on the relative 
intensity over the last n time periods; n is chosen to 
be 5 as it is assumed that in a chatting environment 
users only remember the most recent dialogs. $I_i(x)$ is 
the intensity of emotion category x at discrete time 
i, and the value of $I_i(x)$ varies from 0 to 3, which 
represents the lowest intensity to the highest intensity. 
When i is 0, $I_i(x)$ contains the intensity values of 
the current emotions. 

Instead of adding up the unweighted previous 
intensities, the intensities are weighted according to 
the time. Velasquez (1997) declared that emotions 
do not disappear once their cause has disappeared, 
but rather they decay through time. In the FLAME 
project, El-Nasr, Yen, and Ioerger (2000) follow 
Velasquez’s view and choose to decay positive 
emotions at a faster rate. However, there is not 
enough empirical evidence from El-Nasr et al.’s 
implementation or this implementation to establish 
the actual rate of the decay. In the emotion extrac- 
tion engine, the positive and negative emotions are 
assumed to decay at the same rate and the influence 
period is chosen to be five sentences. Figure 2 
illustrates the assumption. 

Figure 2. The decay of emotion over time 

In Figure 2, Value represents the value of the 
perceived influence of emotion and t represents 
time. EMO represents the emotion that occurred at 
the discrete time point (e.g., the time when a chat 
message is input). In this figure, the value of EMO 
at different time points is the same, which means that 
a user typed six emotional sentences of the same 
intensity. Zero represents the current time. 

However, at time point 0, the perceived influence 
of EMO that occurred at time -5 is 0, which means 
the influence of the emotion input at time -5 has 
disappeared. The EMO that occurred at time -1 is 
the least decayed and has the strongest influence at 
the current time (time 0). 

In the fuzzy emotion-analysis component, the 
emotion decay is represented by the weight param- 
eter, and it is calculated using Equation 2. 



$$ \alpha_i = \alpha_{i+1} - 0.1, \quad i = -5, -4, -3, -2 $$
$$ \alpha_{-1} = 0.5 $$
$$ \alpha_0 = 1 \qquad (2) $$
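
The weighting and accumulation steps can be sketched in Python as follows. This is a minimal illustration, assuming that the sum in Equation 1 runs over the five stored sentences plus the current one and that the weights follow Equation 2; the function names and the list-of-intensities representation are hypothetical.

```python
from collections import deque

# Index order of array E as given in the text.
CATEGORIES = ["happiness", "surprise", "anger", "disgust", "sadness", "fear"]


def decay_weights():
    """Weights of Equation 2, keyed by discrete time i = -5..0."""
    weights = {0: 1.0, -1: 0.5}
    for i in range(-2, -6, -1):  # i = -2, -3, -4, -5
        weights[i] = round(weights[i + 1] - 0.1, 1)
    return weights


def accumulate(history, current):
    """Equation 1: E[x] is the weighted sum of the intensities I_i(x).

    history: up to five previous intensity vectors, oldest first
    current: intensity vector of the current sentence (time 0)
    """
    weights = decay_weights()
    frames = list(history) + [current]
    E = [0.0] * len(CATEGORIES)
    # The last frame is time 0, the one before it time -1, and so on.
    for offset, frame in enumerate(reversed(frames)):
        alpha = weights[-offset]
        for x, intensity in enumerate(frame):
            E[x] += alpha * intensity
    return E


# A bounded deque drops the oldest entry once five sentences are stored.
history = deque([[2, 0, 0, 0, 0, 0], [3, 0, 0, 0, 0, 0]], maxlen=5)
current = [1, 0, 0, 0, 2, 0]
print(decay_weights())
print(accumulate(history, current))
```

Here the bounded deque stands in for the storage component; intensities are the 0 to 3 values described above.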

Fuzzy Membership Functions 

The following fuzzy membership functions and fuzzy 
logic rules are applied to assist the mood calculation. 
Two fuzzy membership functions high and low 
(Equations 3 and 4) are defined to guide the analysis 
of fuzzy functions. 



$$ high(x) = \frac{E[x]}{\sum_{i=0}^{5} E[i]} \qquad (3) $$

$$ low(x) = 1 - \frac{E[x]}{\sum_{i=0}^{5} E[i]} \qquad (4) $$
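
The two membership functions can be sketched in Python as below. This is an illustration only: it assumes that the denominator is the sum of the accumulated intensities over all six categories, and the treatment of an all-zero array (membership 0 for high, 1 for low) is an added assumption, not part of the original description.

```python
def high(E, x):
    """Share of the total accumulated intensity held by category x."""
    total = sum(E)
    # Assumption: with no emotional input at all, nothing is "high".
    return E[x] / total if total > 0 else 0.0


def low(E, x):
    """Complement of high(x)."""
    return 1.0 - high(E, x)


# Hypothetical accumulated intensities in the order
# happiness, surprise, anger, disgust, sadness, fear.
E = [3.0, 0.0, 0.0, 0.0, 1.0, 0.0]
print(high(E, 0), low(E, 0))  # 0.75 0.25
print(high(E, 4), low(E, 4))  # 0.25 0.75
```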



Fuzzy Rules 

Fuzzy rules are created based on the high and low 
membership functions. The rule base includes the 
following. 

Rule 1: 

IF the emotion with largest intensity is “high” 
AND the other emotions are “low” 

THEN the current mood is that emotion 

Rule 2: 

IF the positive emotions are “high” 
AND the negative emotions are “low” 

THEN the current mood is the highest positive 
emotion and the intensity is decreased by 0.1 

Rule 3: 

IF the negative emotions are “high” 
AND the positive emotions are “low” 

THEN the current mood is the highest negative 
emotion and the intensity is decreased by 0.1 

MOOD SELECTION 

When a dialog starts, there are no cues as to what 
mood a user is in. However, it is reasonable to 
assume that the mood of a new user is neutral. As 
the emotion extraction engine acquires more data 
(e.g., the user starts chatting), the above fuzzy rules 
can be applied and the mood of the user can be 
calculated. The centre of gravity (COG) point is an 
important measurement factor in determining the 
dominant rule. In this implementation, the COG point 
is calculated as the average of the rules’ outputs 
(Equation 5). 

$$ COG = \frac{\sum_{x=0}^{3} rule(x)}{4} \qquad (5) $$

STORAGE COMPONENT 

The inputs of the storage component are the current 
emotions. The storage component is implemented as 
a first in, first remove (FIFR) stack with a length of 
five. The structure is shown in Figure 3. 

Figure 3. The structure of the storage component 

EMOTION FILTER 

The emotion filter is designed to analyze the detected 
conflicting emotions. The inputs of the filter 
include the current conflicting emotions and the 
emotions stored in the storage component. Similar 
fuzzy logic membership functions and rules are 
applied to analyze the conflicting emotions. The only 
difference is that here the current emotion data are 
excluded. The detailed fuzzy membership functions 
and rules are not discussed here. Readers can find 
the details in the mood-selection-component section. 

FUTURE TRENDS 

Fuzzy logic is a popular research field in autocontrol, 
AI, and human factors. As adaptivity becomes an 
extremely important interface design criterion, fuzzy 
logic shows its own advantage in creating adaptive 
systems: There are no clear category boundaries, 
and it is easily understood by humans. Fuzzy logic 
can be integrated with other techniques (e.g., induc- 
tive logic, neural networks, etc.) to create fuzzy- 
neural or fuzzy-inductive systems, which can be 
used to analyze complex human-computer interactions. 
For this article, possible future studies may 
include a comparison of the emotions detected by 
the engine with the emotions perceived by human 
observers. Also, the applicability of the fuzzy-logic 
functions to different events should be tested by 
comparing the performance of the emotion-extraction 
engine across the different contexts in which it is 
applied. 

CONCLUSION 

This article presents an overview of the fuzzy logic 
components applied in our emotion-extraction en- 
gine. Fuzzy-logic rules are followed to determine the 
correct mood when conflicting emotions are de- 
tected in a single sentence. The calculation of cur- 
rent mood involves an assessment of the intensity of 
current emotions and the intensity of previously 
detected emotions. This article presents an example 
of fuzzy-logic usage in human-computer interaction. 
Similar approaches can be tailored to fit situations 
like emotional-system interaction, telecare systems, 
and cognition-adaptive systems. 

KEY TERMS 

CMC (Computer-Mediated Communica- 
tion): The human use of computers as the medium 
to communicate to other humans. 

Current Emotion: The current emotion refers 
to the emotion contained in the most recent sen- 
tence. 

Current Mood: The current mood refers to the 
weighted average emotion in the five most recent 
sentences. 



Emotion Decay: The gradual decline of the 
influence of emotion over time. 

Emotion-Extraction Engine: A software sys- 
tem that can extract emotions embedded in textual 
messages. 

Emotion Filter: The emotion filter detects and 
removes conflicting emotional feelings. 

Fuzzy Logic: Fuzzy logic is applied to fuzzy sets 
where membership in a fuzzy set is a probability, not 
necessarily 0 or 1. 

INTRODUCTION 

The rapid progress in information technology (IT) 
has moved computing and the Internet to the main- 
stream. Today’s personal laptop computer has com- 
putational power and performance equal to 10 times 
that of the mainframe computer. Information tech- 
nology has become essential to numerous fields, 
including city and regional planning engineering. 
Moreover, IT and computing are no longer exclusive 
to computer scientists/engineers. There are many 
new disciplines that have been initiated recently 
based on the cross fertilization of IT and traditional 
fields. Examples include geographical information 
systems (GIS), computer simulation, e-commerce, 
and e-business. The arrival of affordable and pow- 
erful computer systems over the past few decades 
has facilitated the growth of pioneering software 
applications for the storage, analysis, and display of 
geographic data and information. The majority of 
these belong to GIS (Batty et al., 1994; Burrough et 
al., 1980; Choi & Usery, 2004; Clapp et al., 1997; 
GIS@Purdue, 2003; Golay et al., 2000; Goodchild et 
al., 1999; IFFD, 1998; Jankowski, 1995; Joerin et al., 
2001; Kohsaka, 2001; Korte, 2001; McDonnell & 
Kemp, 1995; Mohan, 2001; Ralston, 2004; Sadoun, 
2003; Saleh & Sadoun, 2004). 



GIS is used for a wide variety of tasks, including 
planning store locations, managing land use, planning 
and designing good transportation systems, and aid- 
ing law enforcement agencies. GIS systems are 
basically ubiquitous computerized mapping programs 
that help corporations, private groups, and govern- 
ments to make decisions in an economical manner. 
A GIS program works by connecting information/ 
data stored in a computer database system to points 
on a map. Information is displayed in layers, with 
each succeeding layer laid over the preceding ones. 
The resulting maps and diagrams can reveal trends 
or patterns that might be missed if the same informa- 
tion was presented in a traditional spreadsheet or 
plot. 

A GIS is a computer system capable of capturing, 
managing, integrating, manipulating, analyzing, and 
displaying geographically referenced information. 
GIS deals with spatial information that uses location 
within a coordinate system as its reference base 
(see Figure 1). It integrates common database op- 
erations such as query and statistical analysis with 
the unique visualization and geographic analysis 
benefits offered by maps. These abilities distinguish 
GIS from other information systems and make it 
valuable to a wide range of public and private 
enterprises for explaining events, predicting outcomes, 
and planning strategies (Batty et al., 1994; 
Burrough et al., 1980; Choi & Usery, 2004; Clapp et 
al., 1997; GIS@Purdue, 2003; Golay et al., 2000; 
Goodchild et al., 1999; IFFD, 1998; Jankowski, 
1995; Joerin et al., 2001; Kohsaka, 2001; Korte, 
2001; McDonnell & Kemp, 1995; Mohan, 2001; 
Ralston, 2004; Sadoun, 2003; Saleh & Sadoun, 2004). 

Figure 1. A coordinate system (GIS@Purdue, 2003). Point P has a 
latitude of 50 degrees North and a longitude of 60 degrees West. 



BACKGROUND 

A working GIS integrates five key components: 
hardware, software, data, people, and methods. GIS 
stores information about the world as a collection of 
thematic layers that can be linked together by geog- 
raphy. GIS data usually is stored in more than one 
layer in order to overcome technical problems caused 
by handling very large amounts of information at 
once (Figure 2). This simple but extremely powerful 
and versatile concept has proved invaluable for 
solving many real-world problems, such as tracking 
delivery vehicles, recording details of planning applications, 
and modeling global atmospheric circulation. 
GIS technology, as a human-computer interaction 
(HCI) tool, can provide an efficient platform 
that is easy to customize and rich enough to support 
a vector-raster integration environment beyond the 
traditional visualization. 

Figure 2. Illustration of GIS data layers 
(GIS@Purdue, 2003) 

A GIS has four main functional subsystems: (1) 
data input, (2) data storage and retrieval, (3) data 
manipulation and analysis, and (4) data output and 
display subsystem. A data input subsystem allows 
the user to capture, collect, and transform spatial 
and thematic data into digital form. The data inputs 
usually are derived from a combination of hard copy 
maps, aerial photographs, remotely sensed images, 
reports, survey documents, and so forth. Maps can 
be digitized to collect the coordinates of the map 
features. Electronic scanning devices also can be 
used to convert map lines and points to digital 
information (see Figure 3). 

The data storage and retrieval subsystem orga- 
nizes the data, spatial and attribute, in a form that 
permits them to be retrieved quickly by the user for 
analysis and permits rapid and accurate updates to 
be made to the database. This component usually 
involves use of a database management system 
(DBMS) for maintaining attribute data. Spatial data 
usually is encoded and maintained in a proprietary 
file format. 

Figure 3. A digitizing board with an input device 
to capture data from a source map (GIS@Purdue, 
2003) 

The data manipulation and analysis subsystem 
allows the user to define and execute spatial and 
attribute procedures to generate derived information. 
This subsystem commonly is thought of as the heart 
of a GIS and usually distinguishes it from other 
database information systems and computer-aided 
drafting (CAD) systems. The data output subsystem 
allows the user to generate graphic displays — nor- 
mally maps and tabular reports representing derived 
information products. 

The basic data types in a GIS reflect traditional 
data found on a map. Accordingly, GIS technology 
utilizes two basic data types: (1) spatial data that 
describes the absolute and relative location of geo- 
graphic features and (2) attribute data, which de- 
scribe characteristics of the spatial features. These 
characteristics can be quantitative and/or qualitative 
in nature. Attribute data often is referred to as tabular 
data. 
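
The link between the two data types can be illustrated with a minimal Python sketch in which spatial records and attribute records share a feature identifier. The identifiers, field names, and values are hypothetical and are not drawn from any particular GIS package.

```python
# Spatial data: where each feature is (x, y), keyed by a feature identifier.
spatial = {
    101: (12.5, 48.2),
    102: (13.1, 47.9),
}

# Attribute (tabular) data: what each feature is, keyed by the same identifier.
attributes = {
    101: {"name": "Pump station", "capacity_l_per_s": 40},
    102: {"name": "Reservoir", "capacity_l_per_s": 120},
}


def join_features(spatial, attributes):
    """Combine location and attribute records that share an identifier."""
    features = []
    for feature_id, xy in spatial.items():
        record = {"id": feature_id, "xy": xy}
        record.update(attributes.get(feature_id, {}))
        features.append(record)
    return features


def select(features, predicate):
    """Attribute query: keep only the features that satisfy the predicate."""
    return [feature for feature in features if predicate(feature)]


features = join_features(spatial, attributes)
large = select(features, lambda f: f.get("capacity_l_per_s", 0) > 100)
print([(f["name"], f["xy"]) for f in large])
```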

GIS works with two fundamentally different types 
of geographic models: vector and raster (see Figure 
4). In the vector model, information about points, 
lines, and polygons is encoded and stored as a collection 
of (x, y) coordinates. The location of a point feature 
can be described by a single (x, y) coordinate pair. Linear 
features, such as roads and rivers, can be stored as 
a collection of point coordinates. Polygonal features, 
such as sales territories, can be stored as a closed 
loop of coordinates. 
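
As a rough illustration of the vector model, the sketch below stores a point, a line, and a polygon as plain coordinate lists in Python and derives a length and an area from them. The coordinates are invented, and planar (projected) units are assumed.

```python
from math import hypot

# A point is a single (x, y) pair.
point = (2.0, 3.0)

# A linear feature (e.g., a road) is an ordered collection of points.
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]

# A polygonal feature (e.g., a sales territory) is a closed loop:
# the last coordinate repeats the first.
territory = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(coords):
    """Sum of the straight-line segment lengths along the feature."""
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )


def polygon_area(ring):
    """Area of a closed ring using the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2.0


print(line_length(road))        # 8.0
print(polygon_area(territory))  # 12.0
```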

The vector model is extremely useful for describ- 
ing discrete features but less useful for describing 
continuously varying features such as soil type or 
accessibility costs to hospitals. The raster model has 
evolved to model such continuous features. A raster 
image comprises a collection of grid cells, like a 
scanned map or picture. Both vector and raster models 
for storing geographic data have unique advantages 
and disadvantages. Modern GISs are able to 
handle both models. 

Figure 4. GIS data model (IFFD, 1998) 
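
The raster model can be sketched in Python as a grid of cell values plus the information needed to locate the grid on the map. The origin, cell size, and soil-type codes below are assumptions made for illustration.

```python
# Each cell holds a value of a continuously varying feature,
# here a hypothetical soil-type code.
grid = [
    [1, 1, 2],
    [1, 3, 3],
    [2, 3, 3],
]

ORIGIN_X, ORIGIN_Y = 100.0, 200.0  # map coordinates of the top-left corner
CELL_SIZE = 10.0                   # width and height of one cell in map units


def cell_value(x, y):
    """Return the value of the cell containing map position (x, y), or None."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)  # rows count downward from the top
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


print(cell_value(115.0, 185.0))  # 3
print(cell_value(95.0, 185.0))   # None (west of the grid)
```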



GIS SOFTWARE 

A number of GIS software packages exist commer- 
cially, providing users with a wide range of applica- 
tions. Some of them are available online for free. 
Before selecting a GIS package, the user should 
find out whether the selected GIS can meet his or 
her requirements in four major areas: input, manipulation, 
analysis, and presentation. The major GIS 
vendors are ESRI, Intergraph, Landmark Graphics, 
and MapInfo. A brief description of these packages 
is given next. 

ESRI Packages 

The Environmental Systems Research Institute 
(ESRI) provides database design application, data- 
base automation, software installation, and support 
(Korte, 2001; McDonnell & Kemp, 1995; Ralston, 
2004). ARC/INFO allows users to create and man- 
age large multi-user spatial databases, perform 
sophisticated spatial analysis, integrate multiple data 
types, and produce high-quality maps for publica- 
tion. ARC/INFO is a vector-based GIS for storing, 
analyzing, managing, and displaying topologically 
structured geographic data. 

ArcView is considered the world’s most popular 
desktop GIS and mapping software. ArcView pro- 
vides data visualization, query, analysis, and inte- 
gration capabilities along with the ability to create 
and edit geographic data. ArcView makes it easy to 
create maps and add users’ own data. By using 
ArcView software’s powerful visualization tools, it 
is possible to access records from existing data- 
bases and display them on maps. The ArcView 
network analyst extension enables users to solve a 
variety of problems using geographic networks 
such as streets, ATM machines, hospitals, schools, 
highways, pipelines, and electric lines. Other pack- 
ages by ESRI include Database Integrator, 
ArcStorm, ArcTools, and ArcPress. 

Intergraph Packages 

Intergraph specializes in computer graphics systems 
for CAD and produces many packages. Intergraph's 
products serve four types of GIS users: managers, 
developers, viewers, and browsers. Intergraph 
bundles most popular components of Modular GIS 
Environments (MGEs) into a single product called 
GIS Office. The components are (a) MGE Basic 
Nucleus for data query and review, (b) MGE Base 
Mapper for collection of data, (c) MGE Basic Ad- 
ministrator for setup and maintenance of database, 
and (d) MGE Spatial Analyst for creation, query, 
displaying of topologically structured geographic 
data, and spatial analysis (Korte, 2001; McDonnell 
& Kemp, 1995; Ralston, 2004). 

GeoMedia is Intergraph’s first GIS product to 
implement its Jupiter technology. Jupiter functions 
without a CAD core and uses its object, graphics, 
and integration capabilities provided by object link- 
ing and embedding and component object model 
standards of the Windows operating system to inte- 
grate technical applications with office automation 
software (Korte, 2001; McDonnell & Kemp, 1995; 
Ralston, 2004). 

MapInfo Packages 

MapInfo offers a suite of desktop mapping products 
that are different from other leading GIS products 
like MGE and ARC/INFO in that they do not store 
a fully topologically structured vector data model of 
map features. MapInfo Professional provides data 
visualization; step-by-step thematic mapping; and 
three linked views of data: maps, graphs, and tables. 
It displays a variety of vector and raster data for- 
mats. It enables users to digitize maps to create 
vector images. Moreover, it enables users to per- 
form editing tasks such as selecting multiple nodes 
for deletion, copying, and clearing map objects and 
overlaying nodes. MapInfo Professional enables 
multiple data views with a zoom range of 55 feet to 
100,000 miles. It supports 18 map projections, per- 
forming map projection display on the fly (Korte, 
2001; McDonnell & Kemp, 1995; Ralston, 2004). 

MapInfo ProServer enables MapInfo Professional 
to run on a server. This server-side capability 
lets it answer queries over a network, giving 
users access to desktop mapping solutions 
throughout the enterprise. On the client side, MapInfo 
ProServer and MapInfo Professional can be used 
through an Internet Web browser. 

Landmark Graphics Packages 

Landmark Information system has been designed 
mainly for petroleum companies in order to explore 
and manage gas and oil reservoirs. Its products 
include ARGUS and Geo-dataWorks (Korte, 2001; 
McDonnell & Kemp, 1995; Ralston, 2004). ARGUS 
is a generic petroleum common user interface that 
consists of a suite of data access tools for manage- 
ment levels and technical disciplines in an organiza- 
tion. Its logical data are independent and object- 
oriented. It combines Executive Information Sys- 
tems (EIS) with GIS features to query corporate 
data and display results virtually. Geo-dataWorks 
provides graphical project management. It enables 
graphical query and selection. It also allows the user 
to manage multiple projects. Furthermore, it pro- 
vides user-friendly management and query building 
capabilities. 



GIS APPLICATIONS 

GIS applications are increasing at an amazing rate. 
For instance, GIS is being used to assist businesses 
in identifying their potential markets and maintaining 
a spatial database of their customers more than ever 
before. Among the most popular applications of GIS 
are city and regional planning engineering. For ex- 
ample, water supply companies use GIS technology 
as a spatial database of pipes and manholes. Local 
governments also use GIS to manage and update 
property boundaries, emergency operations, and 
environmental resources. GIS also may be used to 
map out the provision of services, such as health 
care and primary education, taking into account 
population distribution and access to facilities. 

Firefighters can use GIS systems to track poten- 
tial damage along the path of forest fires. The Marin 
County Fire Department in Northern California de- 
ploys helicopters equipped with global positioning 
system (GPS) receivers to fly over an area of land 
that is ablaze. The receiver collects latitude and 
longitude information about the perimeter of the fire. 
When the helicopter lands, that information is downloaded 
into a PC, which then connects to a database 
containing information on land ownership, endan- 
gered species, and access roads within the area of 
the fire. The resulting maps are printed out on mobile 
plotters at the scene and distributed to firefighters. 

Preservation groups use GIS software to assess 
possible danger caused by environmental changes. 
In such a case, GIS is used as a tool for integrating 
data across borders and helping to bring people 
together to solve problems in an effective manner. 
For example, the U.S. National Oceanic and Atmo- 
spheric Administration and other organizations use 
GIS systems to create maps of the Tijuana River 
watershed, a flood-prone area that spans the border 
of San Diego (USA) and Tijuana (Mexico). The 
maps include soil and vegetation class structure 
from each city, which allows city and urban planners 
to see across the borders when predicting flood 
danger. 

GIS can be used efficiently as human-machine 
interactive (HMI) tools to realize land-parcel restructuring 
within a well-defined zone whose parcel 
structure is found to be inadequate for agricultural 
or building purposes. Golay et al. (2000) have 
proposed a new prototype of an interactive engineering 
design platform for land and space-related 
engineering tasks, based on the real-time aggregation 
of vector and raster GIS data. This concept 
allows engineers to get real-time aggregate values 
of a continuously defined spatial variable, such as 
land value, within structures they are reshaping. 

GIS tools can be used to manage land efficiently. 
Before the advent of GIS, land management ser- 
vices in public administration used to make decisions 
without analyzing related data properly. These days, 
such organizations are using GIS to properly manage 
land based on the data provided by GIS analysis. 
One example includes the decision support model 
called MEDUSAT, which proposes a structured 
application of GIS and multicriteria analysis to sup- 
port land management. MEDUSAT presents an 
original combination of these tools (Joerin et al., 
2001). In this context, GIS is used to manage infor- 
mation that describes the territory and offer spatial 
analysis schemes. The multicriteria analysis schemes 
then are used to combine such information and to 
select the most adequate solution, considering the 
decision maker’s preference (Jankowski, 1995; 
Joerin et al., 2001). 



GIS techniques can be used along with remote 
sensing schemes for urban planning, implementa- 
tion, and monitoring of urban projects. Such a com- 
bination has the capability to provide the needed 
physical input and intelligence for the preparation of 
basemaps and the formulation of planning proposals. 
They also can act as a monitoring tool during the 
implementation phase. Large-scale city and urban 
projects need decades to complete. Satellite images 
can be used to maintain a real record of terrain 
during this period. Clearly, GIS and remote sensing 
are powerful tools for monitoring and managing land 
by providing a fourth dimension to the city — time 
(Joerin et al., 2001; Kohsaka, 2001). 

GIS systems can be used to aid in urban eco- 
nomic calculations. A basic goal of urban economics 
is to analyze spatial relationships. This typically 
takes the shape of costs to ship customers, employ- 
ees, merchandise, or services between various loca- 
tions. Real estate planners often seek to quantify the 
relations between supply and demand for a particu- 
lar land type in a given geographical region. Spatial 
economic theory shows that transportation costs are 
the most essential factors that determine ease of 
access. GIS can estimate transportation costs in a 
better way by computing the distances along roads, 
weighting interstate roads less than local roads and 
adding delay factors for construction spots, tunnels, 
and so forth (Jankowski, 1995; Joerin et al., 2001). 

GIS can handle complex network problems, such 
as road network analysis. A GIS can work out travel 
times and shortest path between two cities (sites), 
utilizing one of the shortest path algorithms. This 
facility can be built into more complicated models 
that might require estimates of travel time, accessi- 
bility, or impedance along a route system. An ex- 
ample is how a road network can be used to calcu- 
late the risks of accidents. 
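
The two ideas above, weighting road types and finding a shortest path, can be combined in a short Python sketch. Dijkstra's algorithm is used here as one common shortest-path algorithm; the road-type factors, distances, and city names are hypothetical.

```python
import heapq

# Assumed cost factors: interstate kilometers weigh less than local ones.
ROAD_FACTOR = {"interstate": 1.0, "local": 1.5}


def travel_cost(length_km, road_type, delay=0.0):
    """Weighted length plus an optional delay (e.g., construction, tunnel)."""
    return length_km * ROAD_FACTOR[road_type] + delay


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over {node: [(neighbor, cost), ...]}."""
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return float("inf"), []


graph = {
    "A": [("B", travel_cost(100, "interstate")), ("C", travel_cost(130, "local"))],
    "B": [("C", travel_cost(80, "interstate"))],
    "C": [],
}
print(shortest_path(graph, "A", "C"))  # (180.0, ['A', 'B', 'C'])
```

Adding a delay factor to the B to C segment, for example `travel_cost(80, "interstate", delay=30)`, makes the direct local road the cheaper route.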

GIS can be used to model the flow of water 
through a river in order to plan a flood warning 
system. Real-time data would be transmitted by 
flood warning monitors/sensors, such as rain gauges 
and river height alarms, which could be received and 
passed to a GIS system to assess the hazard. If the 
amount and intensity of rain exceeds a certain limit, 
determined by the GIS flood model for the area, a 
flood protection plan could be put into operation with 
computer-generated maps demarcating the vulnerable 
areas at any point in time (Golay et al., 2000; 
Jankowski, 1995; Joerin et al., 2001; Kohsaka, 2001; 
Mohan, 2001; Sadoun, 2003; Saleh & Sadoun, 2004). 
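
The threshold check at the core of such a warning system can be sketched in Python as follows. The sensor names and limit values are hypothetical; in practice the limits would come from the GIS flood model for the area.

```python
def assess_hazard(readings, limits):
    """Return the sensors whose reading exceeds its limit, in name order."""
    return sorted(
        name
        for name, value in readings.items()
        if name in limits and value > limits[name]
    )


# Assumed limits and one set of real-time readings.
limits = {"rain_gauge_mm_per_h": 30.0, "river_height_m": 4.5}
readings = {"rain_gauge_mm_per_h": 42.0, "river_height_m": 3.9}

exceeded = assess_hazard(readings, limits)
if exceeded:
    print("Flood protection plan triggered by:", exceeded)
```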



FUTURE TRENDS 

GIS systems are effective HCI tools that have 
received wide acceptance by organizations and in- 
stitutions. Many towns and cities worldwide have 
started or are in the process of using it in planning 
and engineering services and zones. Many vendors 
have developed GIS software packages of various 
features and options. GIS software is becoming 
more user-friendly with the addition of features 
such as graphical user interfaces (GUIs), animation, 
validation, and verification. A GUI makes the 
interface between the developer/user and the computer 
easier and more effective. Animation helps to verify 
models and plans, as a picture is worth a thousand words; 
it also makes the GIS package more salable. 
These days, GIS is used for a wide range of 
applications, from planning transportation 
systems in cities and towns to modeling the flow 
of water through rivers for flood warning 
systems, in which real-time data sent by sensors 
such as rain gauges and river height alarms 
are passed to a GIS to assess the hazard. 

During the past few years, law enforcement 
agencies around the world have started using GIS to 
display, analyze, and battle crime. Computer-gener- 
ated maps are replacing the traditional push-pin 
maps that used to cover the walls of law enforce- 
ment agencies. Such a tool has given police officers 
the power to classify and rearrange reams of data in 
an attempt to find patterns. In the United States, 
officers in many precincts are using this technology 
to track down crimes. Officers gather reports on 
offenses like car thefts or residential robberies onto 
weekly hot sheets, which they then enter into a 
computer that has a GIS software package. The GIS 
mapping program, in turn, relates each incident to 
some map information by giving it latitude and 
longitude coordinates within a map of the suspected 
area. 
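
One simple way to look for patterns in geocoded incidents is to count them per grid cell. The Python sketch below assumes incidents are already available as (latitude, longitude) pairs; the cell size and the coordinates are invented for illustration and do not describe any specific police system.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.01):
    """Index of the grid cell (cell_deg degrees on a side) containing a location."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hot_spots(incidents, cell_deg=0.01, top=3):
    """Return the `top` cells with the most incidents as (cell, count) pairs."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in incidents)
    return counts.most_common(top)


# Hypothetical weekly hot-sheet entries.
incidents = [
    (35.7812, -78.6391),
    (35.7815, -78.6399),
    (35.7950, -78.6250),
]
print(hot_spots(incidents, top=1))
```

Cells with high counts can then be shaded on the map in place of the clusters of pins on a traditional push-pin map.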

Some police departments in developed countries 
employ GIS computer mapping to persuade resi- 
dents to get involved in community development. In 



North Carolina, police use ArcView to follow the 
relationships between illegal activities and commu- 
nity troubles like untidy lots, deserted trash, and 
shattered windows. 

In India, GIS technology has been applied to 
analyze spatial information for the environmental 
change implications, such as the work done for Delhi 
Ridge (Golay et al., 2000; Jankowski, 1995; Joerin et 
al., 2001; Mohan, 2001; Sadoun, 2003; Saleh & 
Sadoun, 2004). Integrated GIS approaches can pro- 
vide effective solutions for many of the emerging 
environmental change problems at local, regional, 
national, and global levels, and may become the 
preferred environment for ecological modeling. GIS 
tools have been used to monitor vegetation cover 
over periods of time to evaluate environmental con- 
ditions. The Delhi population has grown by about 
25% per year since 1901. Therefore, it is vital to 
predict the rate of increase of CO2 emissions and 
ozone depletion in such a heavily populated city. GIS 
technology can help considerably in predicting possible 
scenarios and in developing environmental plans and solutions. 

CONCLUSION 

GIS technology is an important human computer 
interactive (HCI) tool that has become vital to 
modern city and urban planning/engineering. Using 
GIS can produce cost-effective plans/designs that 
can be modified, tuned, or upgraded, as needed. GIS, 
as an HCI-based tool, is becoming an interdiscipli- 
nary information technology and information sci- 
ence field that has numerous applications to city and 
urban planning/engineering, ranging from land man- 
agement/zoning to transportation planning/engineer- 
ing. Engineering and computer science departments 
worldwide have started to offer undergraduate and 
graduate programs/tracks in this important, evolving 
discipline. These days, GIS is a must for modern city 
planning and engineering, since it can (a) streamline 
customer services in an interactive manner; (b) 
reduce land acquisition costs using accurate quanti- 
tative analysis; (c) analyze data and information in a 
speedy manner, which is important for quicker and 
better decision making; (d) build consensus among 
decision-making teams and populations; (e) optimize 
urban services provided by local governments; (f) 
provide visual digital maps and illustrations in a much 
more flexible manner than traditional manual or 
automated cartography approaches; and (g) reduce 
pollution and the cost of running means of transportation by 
finding the shortest paths to desired destinations. 

KEY TERMS 

Cartography: The art, science, and engineering 
of mapmaking. 

City and Regional Planning/Engineering: The 
field that deals with the methods, designs, issues, and 
models used to have successful plans and designs 
for cities, towns, and regions. 

Coordinate System: A reference system used 
to gauge horizontal and vertical distances on a 
planimetric map. It usually is defined by a map 
projection, a spheroid of reference, a datum, one or 
more standard parallels, a central meridian, and 
possible shifts in the x- and y-directions to locate x, 
y positions of point, line, and area features (e.g., in 
the ARC/INFO GIS system, a system with units and 
characteristics defined by a map projection). A 
common coordinate system is used to spatially 
register geographic data for the same area. 

Data: A collection of attributes (numeric, alpha- 
numeric, figures, pictures) about entities (things, 
events, activities). Spatial data represent tangible 
features (entities). Moreover, spatial data are usu- 
ally an attribute (descriptor) of the spatial feature. 

Database Management Systems (DBMS): Systems 
that store, organize, retrieve, and manipulate 
databases. 

Digital Map: A data set stored in a computer in 
digital form. It is not static, and the flexibility of 
digital maps is vastly greater than paper maps. 
Inherent in this concept is the point that data on 
which the map is based is available to examine or 
question. Digital maps can be manipulated easily in 
GIS package environments. 



GIS: A computer system that permits the user to 
examine and handle numerous layers of spatial data. 
The system is intended to solve problems and investigate 
relationships. The data symbolize real-world 
entities, including spatial and quantitative attributes 
of these entities. 

GPS: Global Positioning System. GPS is a satellite-based 
navigation system that is formed from a 
constellation of 24 satellites and their ground stations. 
GPS uses these satellites as reference points 
to calculate positions accurate to a matter of meters. 
Actually, with advanced forms of GPS, you can 
make measurements to better than a centimeter! 
These days, GPS is finding its way into cars, boats, 
planes, construction equipment, movie-making gear, 
farm machinery, and even laptop computers. 

Information Systems: Information systems are 
the means to transform data into information. They 
are used in planning and managing resources. 

Model: An abstraction of reality that is structured 
as a set of rules and procedures to derive new 
information that can be analyzed to aid in problem 
solving and planning. Analytical tools in a geographic 
information system (GIS) are used for building spatial 
models. Models can include a combination of 
logical expressions, mathematical procedures, and 
criteria that are applied for the purpose of simulating 
a process, predicting an outcome, or characterizing 
a phenomenon. Shannon defined a model as “the 
process of designing a computerized model of a 
system (or a process) and conducting experiments 
with this model for the purpose either of understanding 
the behavior of the system or of evaluating 
various strategies for the operation of the system” 
(pp. 9-15). 

Raster Analysis: Raster analysis bases 
its spatial relationships mainly on the location of the 
cell. Raster operations performed on multiple input 
raster data sets usually output cell values that are the 
result of calculations on a cell-by-cell basis. 
The value of the output for one cell is usually 
independent of the value or location of other input or 
output cells. 



Spatial Data: Represents tangible or located 
features, such as a river, a 1,000 by 1,000 meter lot 
in a grid, a campus, a lake, a river, or a road. 

Validation: Refers to ensuring that the assump- 
tions used in developing the model are reasonable in 
that, if correctly implemented, the model would 
produce results close to that observed in real sys- 
tems. Model validation consists of validating as- 
sumptions, input parameters and distributions, and 
output values and conclusions. 

Vector Analysis: In vector analysis, all operations 
are possible, because features in one theme are 
located by their position in explicit relation to existing 
features in other themes. The complexity of the 
vector data model makes for quite complex and 
hardware-intensive operations. 

Verification: Verification is the process of find- 
ing out whether the model implements the assump- 
tions considered. A verified computer program, in 
fact, can represent an invalid model. 

INTRODUCTION 

Decision making in planning should consider 
state-of-the-art techniques in order to minimize the risk 
and time involved. Proper planning in developing 
countries is crucial for their economic recovery 
and prosperity. Proper database systems, such as 
the ones based on GIS, are a must for developing 
countries so that they can catch up and build effective 
and interactive systems in order to modernize 
their infrastructures and to help improve the standard 
of living of their citizens. The huge and fast 
advancement in computing and information technology 
makes it easy for developing countries to build 
their database infrastructures. GIS technology is 
one of the best and fastest tools to build such 
systems, manage resources, encourage businesses, 
and help to make efficient and cost-effective decisions. 

To support better-informed decision making in 
planning the improvement of the Bank of Jordan in 
the city of Amman, Jordan, we had to build 
a database system and a digital map for the city of 
Amman, the Bank of Jordan, its branches in Amman, 
and all other banks and their branches in Amman. 
We used the popular Geomedia software to allow 
interactive, time-saving data management; to offer 
the ability to perform different analyses, including 
statistical ones; and to provide graphical geospatial 
results on maps. Using the Geomedia software, we 
built the many layers needed for the planning processes, 
mainly for the region of Amman, because of the lack 
of available digital data in the area. Some layers 
concern the project and relate to the bank, such as 
the geographic distribution of the Bank of Jordan 
branches and its ATMs; others serve for comparison, 
such as the geographic distribution of all other 
banks, their branches, and ATMs in Amman. This 
allows the decision makers to compare the bank with all 
competing banks in Amman. Besides the geographic 
location of all existing banks, important 
attribute data are provided for the Bank of Jordan in 
particular and for all the other banks in general (Batty et 
al., 1994a, 1994b; Burrough et al., 1980; Doucette et 
al., 2000; Elmasri & Navathe, 2004; Goodchild, 
2003; Longley et al., 1999a, 1999b). 

BACKGROUND 

The Bank of Jordan started planning for new ATM 
sites in Amman using the traditional method and, at 
the same time, a GIS pilot project to support 
building a quick geospatial information infrastructure 
that can assist in the decision-making process 
according to provided criteria, which can be integrated 
into the GIS analysis process. The real challenge 
here is to build a digital database that provides 
a complete digital map of Amman to help in the 
analysis process. 

Many layers for different purposes were created, 
including the country boundaries, governorate boundaries, 
city districts and subdistricts, main and submain 
streets, blocks and city blocks, government organizations, 
commercial areas and trading centers with 
cinemas and theaters, commercial companies, insurance 
companies, restaurants, hotels, hospitals, gas 
stations, the Bank of Jordan branches layer, and the 
branches of all other banks with their ATMs in the 
city of Amman. 

The design of these layers is based on a specific 
GIS data model suited for this application. It is based 
on integrating a SPOT image of Amman with many 
scanned paper maps that provide the needed information. 
Moreover, Global Positioning System (GPS) data 
were integrated into our GIS to create many of the 
layers required for the analysis. 

Once the geospatial database for the city and the 
banks is ready, the rest of the work is easy and 
flexible; the planners can integrate their functions 
and conditions quickly and will be able to 
make better decisions. Moreover, part of 
the data could be made public and accessible through 
the Web to help not only in locating the sites of 
ATMs but also in carrying out banking interactions, 
which is a form of human computer interaction, 
as is done in developed countries (Batty 
et al., 1994a; Burrough et al., 1980; Goodchild, 2003; 
Longley et al., 1999a, 1999b). 



METHODOLOGY AND MODELING 

Using scanning, digitizing, and registration tech- 
niques as well as collected and available attributes of 
data, many layers were created for the database. 
Figure 1 illustrates the general procedure for creat- 
ing the needed layers. 

Many basic geospatial data layers were built (by 
feature) for the project, as follows: 

1. Line features (street layers) such as highways, 
subways, roadways, and railways. 

2. Polygon features such as urban, circles, farms, 
gardens, Jordan Governorates, Amman dis- 
tricts, Amman subdistricts, and so forth. 

3. Point features such as banks and/or ATMs, 
restaurants, hotels, large stores, hospitals, sport 
clubs, cinemas (movie theaters), cultural and 
social centers, gas stations, and police stations. 



Figure 2 illustrates a descriptive diagram of the 
GIS data model creation, measuring, development, 
and implementation stages. 






IMPLEMENTATION 

A SPOT image is used as the registration reference 
frame for all scanned hardcopy maps, as indicated in 
the schematic GIS data model in Figure 2. The 
reference system is the Jordanian Transverse 
Mercator (JTM). Figure 3 illustrates the reference 
points in the registration process using the Geomedia 
software. 
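
Registration against reference points amounts to fitting a transformation from scanned-map pixel coordinates to the coordinates of the reference frame. The sketch below assumes the simplest case, an affine transformation determined by exactly three non-collinear control points; it is not a description of how Geomedia performs registration, and the control-point values are hypothetical numbers, not actual JTM coordinates.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve the 3x3 linear system m * v = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution

def fit_affine(pixel_pts, map_pts):
    """Fit X = a*x + b*y + c and Y = d*x + e*y + f from three control points."""
    m = [[x, y, 1.0] for x, y in pixel_pts]
    a, b, c = solve3(m, [X for X, _ in map_pts])
    d, e, f = solve3(m, [Y for _, Y in map_pts])
    return (a, b, c, d, e, f)

def apply_affine(coeffs, x, y):
    """Convert a pixel position to map coordinates."""
    a, b, c, d, e, f = coeffs
    return (a * x + b * y + c, d * x + e * y + f)

# Hypothetical control points: pixel (column, row) and matching map (easting, northing).
pixel_pts = [(0, 0), (1000, 0), (0, 1000)]
map_pts = [(230000.0, 160000.0), (235000.0, 160000.0), (230000.0, 155000.0)]

coeffs = fit_affine(pixel_pts, map_pts)
print(apply_affine(coeffs, 500, 500))
```

In practice more than three reference points are usually collected and the transformation is fitted by least squares, so that the residuals indicate the quality of the registration.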

Digitization then follows to create the polygon and 
line layers. Figure 4 shows a scanned map 
image during digitizing; the drawn red line is the 
digitized line on the map. 

Figure 5 (parts a, b, and c) shows examples of the 
resulting layers (maps) using line features such as 
highways, subways, and roads digitized layers, re- 
spectively. 

Figure 6 (parts a, b, c, and d) shows examples of 
the resulting layers (maps) using polygon features 
such as district, subdistrict, urban, and governorate 
layers, respectively. 

Figure 7 shows some of the created layers for 
banks, ATMs, hospitals, and gas station locations. 
Finally, by superimposing all of the previous layers, a final 
resulting map was made available to help in the 
decision-making process. Any kind of information 
could be provided from any layer to help in the 
planning, improvement, or finding of the location of 
an ATM, a new bank branch, and so forth. 



Figure 1. Overview of the GIS project procedure 

Figure 2. The GIS model chart 

Figure 3. Image registration 

Figure 4. Digitizing process; line is the digitized line on the tourist map 

Figure 5(a). Highways layer 

Figure 5(b). Subways layer 

Figure 5(c). Roadways layer 

Figure 6(a). District layer 

Figure 6(b). Subdistrict layer 

Figure 6(c). Urban layer 

Figure 6(d). Governorates layer 

Figure 7(a). Bank of Jordan locations layer 

Figure 7(b). ATM locations for all banks layer 

Figure 7(c). All banks locations layer 

Figure 7(d). Hospital locations layer 

Figure 7(e). Gas station locations layer 




RESULTS AND ANALYSIS 

By preparing all required geospatial data layers 
needed for the project, we can start the analysis, 
depending on the digital database we acquired. In 
building our GIS, we used the Geomedia software 
package to manage our database and to allow the 
needed analysis. This package is a good example of 
efficient human interactive tools. All kinds of analysis 
can be conducted using the software, and 
the results are posted geographically on maps 
related to the geographic site. Many analysis techniques 
could be used, such as thematic map techniques 
(a map of all banks and business centers in 
Amman), classification techniques (classify by color 
at each subdistrict), querying techniques about the 
shortest path or the largest population or the smallest 
area (the distance between two branches of the 
bank), and buffering (circle buffer around the existing 
banks to show the served area). Another important 
analysis that is possible using GIS is 
statistical analysis. 

A case study as an application of the decision-making 
process and planning with GIS is to locate 
sites for ATMs in Amman, as we mentioned earlier. 
In order to integrate our location constraints in the 
decision-making process using the GIS data that we 
created, we first had to define our selection criteria 
and then use them to identify locations. We used 
the bank’s selection criteria, under which ATM 
locations should satisfy the following: 

1. There should be an ATM at each branch of the 
Bank of Jordan. 

2. There should be an ATM at each subdistrict 
with a population of about 10,000 people or 
more (nearby trading centers, gas stations, 
supermarkets, and so forth). 

3. There should be an ATM at busy shopping 
areas full of stores, restaurants, hotels, or malls. 

4. There should be an ATM at popular gas stations 
that are located on highways or busy 
streets. 

5. Finally, there should be ATMs at a 10-km 
distance from each other in populated areas 
(the maximum area coverage for an ATM is 
within a 5-km radius). 



To implement the needed criteria in GIS domain, 
new layers have to be extracted in order to help in 
choosing the new ATM locations as follows: 

• A query about each subdistrict that has a 
population of about 10,000 or more has been 
conducted. For example, Al-Abdaly district in 
Amman has been chosen. It consists of four 
subdistricts with a total population of about 
92,080 inhabitants; namely, Al-Shemesany, 
Al-Madeena, Al-Reyadeyah, Al-Lweebdeh, and 
Jabal Al-Hussein (see Figure 8(a)). 

A GIS-Based Interactive Database System for Planning Purposes 

Figure 8(a). Abdali area with Bank of Jordan locations 

• Make another query to have a map for all busy 
commercial areas full of stores, restaurants, 
hotels, and important gas stations that are located 
on a highway or busy, crowded streets. 

Figure 8(b). All point features of interest, such as trade stores, 
restaurants, hotels, malls, and gas stations 

• Make a query to have a map that shows the 
distribution of all banks in the study area 
(Figure 8(c)). 

• Create a buffer zone layer around the existing 
banks, branches, and ATMs in order to have a 
map to show clearly the served areas (inside 
the circles). Our constraint is that bank service 
coverage is within a circular area of a 5-km radius 
(see Figure 8(d)). 
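
The circular buffer test in the last step can be illustrated with a short sketch. It assumes that service points are stored as latitude/longitude pairs and that great-circle (haversine) distance is an acceptable measure at this scale; the coordinates below are placeholders and not actual branch locations, and only the 5-km radius comes from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_served(point, service_points, radius_km=5.0):
    """True if the point lies inside the circular buffer of any service point."""
    return any(haversine_km(*point, *sp) <= radius_km for sp in service_points)

# Placeholder coordinates for one existing branch and two candidate sites.
existing = [(31.95, 35.91)]
near_candidate = (31.96, 35.92)   # roughly 1.5 km from the branch
far_candidate = (32.05, 35.91)    # roughly 11 km from the branch

print(is_served(near_candidate, existing))
print(is_served(far_candidate, existing))
```

Points for which `is_served` is false fall outside every circle on the buffer map and are therefore the areas where a new ATM would extend coverage.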

Finally, by combining all of these conditions (map 
overlay), we can find the best new ATM locations 
that satisfy the bank’s constraints. Figure 9 shows 
the suggested ATM locations in Amman. 

The results shown in Figure 9 for the new loca- 
tions match those that resulted from the study of the 
bank-planning department. This comparison is made 
to demonstrate to the directors of the bank and the 
public the capability, effectiveness, and accuracy of 
the GIS technology as a human computer interactive 
tool for engineering and planning purposes. The 
geospatial interactive data are now available for 
further analysis and planning of projects in all areas, 
such as real estate and loan grant applications. 

Figure 8(c). Bank distribution layer in Abdaly area 

Figure 8(d). Circular buffer zone layer to show 
the areas covered by banking service 



FUTURE TRENDS 

GIS technology is an effective, accurate, and eco- 
nomical HCI tool that has received wide acceptance 
by numerous organizations and institutions and for all 
kinds of applications. Many financial institutions, 
banks, towns, and cities worldwide have started or 
are in the process of using it in planning and engineering 
services. Today, there are many GIS software 
packages of various features and options that 
are available commercially. Using GIS software has 
become friendlier with the addition of new features 
in the software such as the animation, verification 
and validation features, and graphical user inter- 
faces (GUIs). Verification helps to verify models in 
order to make sure that the model is a real represen- 
tation of real systems under analysis. Validation is 
used to make sure that the assumptions, inputs, 
distribution, results, outputs, and conclusions are 
accurate. GUI is designed and used to provide an 
effective and user-friendly computer interactive 
environment. Animation helps provide a visual way 
to verify models and to make the GIS package more 
salable and enjoyable for potential users and buyers. 

GIS technology is becoming a trendy technology 
for almost all applications that range from planning 
irrigation systems to planning ATM systems for 
banks and financial institutions. Recently, law en- 
forcement agencies around the world have started 
using GIS to display, analyze, and battle crime. 
Computer-generated maps are replacing the tradi- 
tional maps that used to cover the walls of law 
enforcement agencies. Such a tool has given police 
officers the power to classify and rearrange reams 
of data in an attempt to find patterns. Some police 
departments in developed countries employ GIS 
computer mapping to persuade residents to get 
involved in community development. In India, GIS 
technology has been applied to analyze spatial information 
for the environmental change implications, 
such as the work done for Delhi Ridge (Mohan, 
2001; Joerin et al., 2001; Sadoun, 2003; Saleh & 
Sadoun, 2004; Golay et al., 2000; Jankowski, 1995). 

Figure 9. Proposed ATM locations in Amman City 

GIS technology has been used to provide effec- 
tive solutions for many of the emerging environmen- 
tal problems at local, regional, national, and global 
levels. For example, GIS has been used to monitor 
vegetation cover over periods of time in order to 
evaluate environmental conditions. It can be used to 
predict the increase in CO2 emissions and ozone 
depletion in heavily populated cities such as Tokyo, 
New York, Mexico City, Chicago, Los Angeles, and 
so forth. GIS technology can help provide all possible 
solutions and outcomes for different scenarios and 
settings. 

Computer interactive tools such as GIS technol- 
ogy used to support decision-making activities can 
have different purposes: information management 
and retrieval, multi-criteria analysis, visualization, 
and simulation. The quality of human computer 
interface can be measured not only by its accuracy 
but also by its ease of use. Therefore, most 
state-of-the-art GIS software tools seek to provide 
user-friendly interfaces using features such as the GUI 
and animation options. 

Many other software packages have started to 
add GIS-like functionality; spreadsheets and their 
improved graphics capabilities in handling 2-D maps 
and 3-D visualizations are examples of such a trend. 
Software is being divided up on the desktop into 
basic modules that can be integrated in diverse 
ways, while other software is becoming increasingly 
generic in that manner. GIS is changing as more 
functions are embodied in hardware. 



CONCLUSION, REMARKS, AND 
RECOMMENDATIONS 

The tremendous advancement in information and 
computer technologies has changed the way of 
conducting all aspects of our daily life. Due to the 
amazing progress in these technologies, planning for 
developing a country has become more accurate, 
economical, effective, and quantitative. The avail- 
ability of digital databases has helped in better 
decision making in urban planning. Such databases 
can help to convince the developed world to have 
business and trade with developing countries. Build- 
ing a digital database or a GIS system can help in all 
domains. GIS has become a vital tool to solve 
administrative, security, health, commercial, and 
trade matters. 

GIS technology offers an interactive and power- 
ful tool to help in decision making and planning. It 
offers answers on maps that are easy to visualize 
and understand. Moreover, the user/planner/engi- 
neer can build his or her database once and then later 
can easily alter it as needed in almost no time. It can 
be used interactively, reduces the need for paper maps, 
and allows customization for the specific problems 
encountered. Moreover, it allows great storage and fast 
access to data. Finally, the advancement of the 
World Wide Web has allowed public access to all 
needed information to help in better serving the 
world. The world is looking for standardization of 
database systems and centralization of the source to 
make it easy to find and use. 

It is worth mentioning that development needs 
quantitative planning and informed decision making 
using modern technologies such as a digital database 
and a GIS, as such technologies provide quick, 
accurate, interactive, and convincing plans and rec- 
ommendations. 

In this work, we present recommendations based 
on the analysis of collected digital data using GIS 
techniques for building the needed maps for city 
planning. A case study is considered here, which 
deals with finding the proper locations of ATMs for 
the Bank of Jordan located in the capital city of 
Jordan, Amman. The registration and digitization 
processes, the GPS measurements integration, and 
the implementation of statistical data such as popu- 
lation and other related information are presented. 
Our work also presents the criteria used in the spatial 
analysis for modeling the process of the ATM sites’ 
selection. Finally, the article provides a map of the 
new proposed ATM sites selected for the case 
study. 

KEY TERMS 

Animation: A graphical representation of a simu- 
lation process. The major popularity of animation is 
its ability to communicate the essence of the model 
to managers and other key project personnel, greatly 
increasing the model’s credibility. It is also used as 
a debugging and training tool. 

ATM Systems: Automatic Teller Machines are 
installed by banks in different locations of the city or 
town in order to enable customers to access their 
bank accounts and draw cash from them. 

City and Regional Planning/Engineering: The 
field that deals with the methods, designs, issues, and 
models used to have successful plans and designs 
for cities, towns, and regions. 

Coordinate System: A reference system used 
to gauge horizontal and vertical distances on a 
planimetric map. It is usually defined by a map 
projection, a spheroid of reference, a datum, one or 
more standard parallels, a central meridian, and 
possible shifts in the x- and y-directions to locate x, 
y positions of point, line, and area features (e.g., in 
ARC/INFO GIS system, a system with units and 
characteristics defined by a map projection). A 
common coordinate system is used to spatially reg- 
ister geographic data for the same area. 

Data: A collection of attributes (numeric, alpha- 
numeric, figures, pictures) about entities (things, 
events, activities). Spatial data represent tangible 
features (entities). Moreover, spatial data are usu- 
ally an attribute (descriptor) of the spatial feature. 

Database Management Systems (DBMS): Systems 
that store, organize, retrieve, and manipulate 
databases. 

Digital Map: A digital map is a data set stored 
in a computer in digital form. It is not static, and the 
flexibility of digital maps is vastly greater than paper 
maps. Inherent in this concept is the point that data 
on which the map is based is available to examine or 
question. Digital maps can be manipulated easily in 
GIS package environments. 

Digital Satellite Images: Digital images sent 
by satellite systems that are usually launched in 
special orbits such as the geostationary orbit. 
Satellites in the latter type of orbit circle at about 
35,800 km above the surface of the earth and are able 
to cover the same area of the earth 24 hours a day. 

Digitization: The process of converting analog 
data to digital data, usually in binary form. 
Programmers find dealing with digital data 
much easier than dealing with analog data. 

GIS: A computer system that permits the user to 
examine and handle numerous layers of spatial data. 
The system is intended to solve problems and inves- 
tigate relationships. The data symbolize real-world 
entities, including spatial and quantitative attributes 
of these entities. 

GPS: Global Positioning System is a satellite-based 
navigation system that is formed from a 
constellation of 24 satellites and their ground stations. 
GPS uses these satellites as reference points 
to calculate positions accurate to a matter of meters. 
Actually, with advanced forms of GPS, you can 
make measurements to better than a centimeter! 
These days, GPS is finding its way into cars, boats, 
planes, construction equipment, movie-making gear, 
farm machinery, and even laptop computers. 

Model: An abstraction of reality that is struc- 
tured as a set of rules and procedures to derive new 
information that can be analyzed to aid in problem 
solving and planning. Analytical tools in a geographic 
information system (GIS) are used for building spa- 
tial models. Models can include a combination of 
logical expressions, mathematical procedures, and 
criteria, which are applied for the purpose of simu- 
lating a process, predicting an outcome, or charac- 
terizing a phenomenon. Shannon defined a model as 
“the process of designing a computerized model of a 
system (or a process) and conducting experiments 
with this model for the purpose either of understand- 
ing the behavior of the system or of evaluating 
various strategies for the operation of the system” 
(pp. 9-15). 

Spatial Data: Spatial data represent tangible or 
located features such as a river, a 1,000 by 1,000 
meter lot in a grid, a campus, a lake, a river, a road, 
and so forth. 

Validation: Validation refers to ensuring that 
the assumptions used in developing the model are 
reasonable in that, if correctly implemented, the 
model would produce results close to that observed 
in real systems. Model validation consists of validat- 
ing assumptions, input parameters and distributions, 
and output values and conclusions. 

Verification: Verification is the process of find- 
ing out whether the model implements the assump- 
tions considered. A verified computer program, in 
fact, can represent an invalid model. 

INTRODUCTION 

Globalization is a trend in the new industrial era. 
Global economy has seen a huge amount of product 
and technology exchanges all over the world. With 
the increase in exports, and the resulting increase in 
the worldwide exchange of technical products, a 
product will now be used by several 
international user groups. As a result, there is an 
increasing number of user groups with different 
cultural features and different cultural-based user 
philosophies. All these user groups and philosophies 
have to be taken into account by a product developer 
of human machine systems for a global market. 

User requirements of product design have be- 
come much more valued than before because cul- 
tural background is an important influencing variable 
that represents abilities and qualities of a user (del 
Galdo & Nielsen, 1996). However, there is a gap in 
developers’ knowledge when handling product design 
according to the culture-dependent user requirements 
of a foreign market (Rose & Zühlke, 
2001), so “user-oriented” product design has 
not always been achieved on the international market. 

BACKGROUND 

Usability is the key word to describe the design and 
engineering of usable products. The term describes 
also a systematic process of user-oriented design to 
engineer “easy-to-use” products (see ISO 13407, 
1999). One key element for success in this field is to 
know the target groups and their requirements. 
Hence, in times of globalization, usability experts 
have to integrate intercultural aspects into their 
approaches (see Rose, 2002). Therefore, usability 
experts have to know their target group and its 
requirements in the target culture. 

For a foreign market, localized design (local 
design is for a specific culture and global design is 
for many cultures) is needed to address the target 
culture. “There is no denying that culture influences 
human-product interaction” (Hoft, 1996). This has 
caused a change in the design situation in a way that 
engineers nowadays have to face up to other user 
groups with different cultures, which they are not 
familiar with. It is now unrealistic for them to rely 
only on their intuition and personal experience gained 
from their own culture to cope with the localized 
design. Although, it is clear that cultural require- 
ments should be well addressed in localized designs. 

INTERCULTURAL HUMAN 
MACHINE SYSTEMS 

Day (1996) pointed out that we have to recognize 
that “any technology should be assessed to determine 
its appropriateness for a given culture.” This 
implies that, in a time of globalization, user-oriented 
design must also be culture-oriented. 

A good understanding of culture could provide 
designers with clues for making this assessment. 
Cultural anthropologists and consultants have 
conducted many cultural studies and obtained plenty 
of cultural data (e.g., Bourges-Waldegg & Scrivener, 
1998; del Galdo, 1990; Honold, 2000; Marcus, 
1996; Prabhu & Harel, 1999; Rose, 2002). 

The Human Machine System [HMS] engineering 
process is influenced by the cultural context of 
the current developer and the former user. 
The developer will construct the future product for 
expected users. With the task and requirement 
analysis, he is able to integrate his future 
user. The matching between developer and user 
model will influence the product and its 
construction. For the development of intercultural 
HMS, it means the following situation: a developer 
from culture A has to construct/design a product 
for the user in culture B. Therefore, it is important 
to mention this fact and to analyze the 
culture-specific user-requirements. Honold (2000) 
describes intercultural influence on the user’s 
interface engineering process in these main 
aspects: user-requirements, user interface design 
and user interface evaluation. In case of 
localization, it is necessary to know the specific 
user’s needs of the cultural-oriented product/ 
system. It is necessary to analyze the culture- 
specific user-requirements. Such an analysis is 
the basis for a culture-specific user interface 
design (see also ISO 13407, 1999). To get valid 
data from the evaluation of current systems or 
prototypes with a culture-specific user interface, 
an intercultural evaluation is necessary. Culture 
influences the whole user interface engineering 
process as well as the HMS engineering process. 
Through this influence, a management of 
intercultural usability engineering is necessary. 
This is a challenge for the future. 

Modern user-centered approaches include cul- 
tural diversity as one key aspect for user-friendly 
products (Rose, 2004). Liang (2003) has observed 
the multiple aspects of cultural diversity, the micro- 
view on the user and the macro-view on the engi- 
neering process. 

Technology has changed the ways people doing 
their activities and accelerated the trend of 
globalization. The consequences are the increase 
of cultural diversity embedded in the interaction 
and communication and the pervasiveness of 
interactive systems applied in almost every human 
activities ... Therefore, when we look at cultural 
issues in interactive systems, we should consider 
not only human activities supported by the systems 
but also the activities or processes of the design, 
the implementation and the use. (Liang, 2003) 

According to Rose (2002), the usage of intercultural 
user interfaces and intercultural human machine 
systems describes the internationalization and 
localization of products, and excludes global products. 
Intercultural human machine systems are defined as 
systems in which human and machine have the same 
target, and in which the functions and information 
needed to reach the target are offered and displayed 
with ergonomic considerations based on ergonomic 
rules and guidelines. Beyond this, the intercultural 
human machine system takes into account the cultural 
diversity of users (according to culture-specific 
user requirements) and specific technical features 
as well as frame or context requirements 
based on cultural specifics. Hence, the intercultural 
human machine system offers the functions and 
information needed to realize a user-oriented human 
machine system that is optimized for the target 
user and the application used in his/her 
culture-determined usage context (Rose, 2004). 

It has to be mentioned that there are culture-based 
differences between user and developer. 
Therefore, the integration of cultural specifics is a 
natural tribute to the diversity of user and developer 
cultures in a time of globalization. The mental model of 
a developer from Germany is mostly very different 
from the mental model of a developer in China. 
Differences between developers and users stem 
from differences in their cultural contexts. 



FUTURE TRENDS 

New research or application fields offer new chal- 
lenges. Intercultural human machine system design 
is a new field with huge challenges. Smith and Yetim 
(2004) state the following: 

Effective strategies that address cultural issues 
in both the product and the process of information 
systems development now often are critical to 
system success. In relation to the product of 
development, cultural differences in signs, 
meanings, actions, conventions, norms or values, 
etc., raise new research issues ranging from 
technical usability to methodological and ethical 
issues of culture in information systems. In 
relation to the process of development, cultural 
differences affect the manner, in which users are 
able to participate in design and to act as subjects 
in evaluation studies. (Smith & Yetim, 2004) 

The field of intercultural human machine systems, 
which is mainly addressed in the research of 
big global companies, is, in practice, a typical topic 
for global players. But developers in different 
application areas have knowledge gaps in this field. 
An analysis of developers’ requirements in the 
production automation area has shown the following: 
more than half would like to adapt an existing user 
system to foreign countries (internationalization) or 
to a specific user culture (localization). Forty percent 
think the globalization problem is solved, and more 
than 52% would be interested in supporting intercultural 
design, for example, by developing guidelines (Rose & 
Zühlke, 2001). Therefore, developing intercultural 
human machine systems is still a challenge for the 
future. Resolving such a challenge would inform 
developers and help them focus more on cultural 
aspects of design and engineering of products and 
systems (Evers, Rose, Honold, Coronado, & Day, 
2003). 

CONCLUSION 

This article has shown the importance of globaliza- 
tion and culture to usability. An introduction to cul- 
tural diversity in the context of human machine 
systems and a definition of intercultural human ma- 
chine systems were given. Usability experts need 
information about the target culture and the culture- 
specifics of the users. This information is a basis for 
culture- and user-oriented human machine system 
design. 

Ergonomic human machine systems that accept 
and reflect the cultural diversity of the targeted users 
are strongly needed. 

Giving attention to users’ and developers’ needs 
in the context of globalization, usage, and engineering 
enables the creation of products and systems with high 
usability. 



REFERENCES 

Bourges-Waldegg, P., & Scrivener, S. A. R. (1998). 
Meaning, the central issue in cross-cultural HCI 
design. Interacting with Computers: The 
Interdisciplinary Journal of Human-Computer-Interaction, 
9(February), 287-309. 

Day, D. (1996). Cultural bases of interface acceptance: 
Foundations. In M. A. Sasse, R. J. Cunningham, 
& R. L. Winder (Eds.), People and Computers XI: 
Proceedings of HCI ’96 (pp. 35-47). London: 
Springer Verlag. 

del Galdo, E. M. (1990). Internationalization and 
translation: Some guidelines for the design of 
human-computer interfaces. In J. Nielsen (Ed.), 
Designing user interfaces for international use 
(pp. 1-10). Amsterdam: Elsevier Science Publishers. 

del Galdo, E. M., & Nielsen, J. (1996). International 
user interfaces. New York: John Wiley & Sons. 

Evers, V., Rose, K., Honold, P., Coronado, J., & 
Day, D. (Eds.) (2003, July). Designing for global 
markets 5. Workshop Proceedings of the Fifth 
International Workshop on Internationalisation 
of Products and Systems, IWIPS 2003, Berlin, 
Germany (pp. 17-19). 

Hoft, N. (1996). Developing a cultural model. In E. 
M. del Galdo & J. Nielsen (Eds.), International 
user interfaces (pp. 41-73). New York: Wiley. 

Honold, P. (2000, July 13-15). Intercultural usability 
engineering: Barriers and challenges from a German 
point of view. In D. Day, E. M. del Galdo, & 
G. V. Prabhu (Eds.), Designing for global markets 2. 
Second International Workshop on 
Internationalisation of Products and Systems 
(IWIPS 2000), Baltimore (pp. 137-148). Backhouse 
Press. 

ISO 13407 (1999). Benutzer-orientierte Gestaltung 
interaktiver Systeme (user-oriented design of 
interactive systems). 

Liang, S.-F. M. (2003, August 24-29). Cross-cultural 
issues in interactive systems. In Ergonomics 
in the Digital Age. Proceedings of the International 
Ergonomics Association and the 7th 
Joint Conference of Ergonomics Society of 
Korea/Japan Ergonomics Society, Seoul, Korea. 

Marcus, A. (1996). Icon and symbol design issues 
for graphical user interfaces. In E. M. del Galdo & 
J. Nielsen (Eds.), International user interfaces 
(pp. 257-270). New York: John Wiley & Sons. 

Prabhu, G., & Harel, D. (1999, August 22-26). GUI 
design preference validation for Japan and China: 
A case for KANSEI engineering? In H.-J. Bullinger 
& J. Ziegler (Eds.), Human-Computer Interaction: 
Ergonomics and User Interfaces, Vol. 1, 
Proceedings 8th International Conference on 
Human-Computer Interaction (HCI International 
’99), Munich, Germany (pp. 521-525). 

Rose, K. (2002). Methodik zur Gestaltung 
Interkultureller Mensch-Maschine-Systeme in der 
Produktionstechnik (Method for the design of 
intercultural human machine systems in the area of 
production automation). Dissertationsschrift zur 
Erlangung des Akademischen Grades ‘Doktor-Ingenieur’ 
(Dr.-Ing.) im Fachbereich Maschinenbau 
und Verfahrenstechnik der Universität 
Kaiserslautern, Fortschritt-Bericht pak, Nr. 5, 
Universität Kaiserslautern. 

Rose, K. (2004). The development of culture-oriented 
human-machine systems: Specification, analysis 
and integration of relevant intercultural variables. In 
M. Kaplan (Ed.), Cultural ergonomics. Published 
in the Elsevier Series, Advances in Human Performance 
and Cognitive Engineering Research, Vol. 
2 (pp. 61-103). Lawrence Erlbaum Associates, 
Oxford: Elsevier, UK. 

Rose, K., & Zühlke, D. (2001, September 18-20). 
Culture-oriented design: Developers’ knowledge gaps 
in this area. In G. Johannsen (Ed.), Preprints of 8th 
IFAC/IFIP/IFORS/IEA Symposium on Analysis, 
Design, and Evaluation of Human-Machine Systems, 
Kassel, Germany (pp. 11-16). 

Smith, A., & Yetim, F. (2004). Global human-computer 
systems: Cultural determinants of usability. 
Interacting with Computers: The Interdisciplinary 
Journal of Human-Computer-Interaction, 
16(January), 1-5. Special Issue, Global human-computer 
systems: Cultural determinants of usability. 



KEY TERMS 

Culture: Common meaning and values of a 
group. Members of such a group share and use the 
accorded signs and roles as a basis for communication, 
behaviour, and technology usage. Mostly, a 
country is used as a compromise unit for referring to 
or defining rules and values, and is often used as a 
synonym for the user’s culture. 

Culture-Oriented Design: Specific kind of user- 
oriented design, which focuses on the user as a central 
element of development, and also takes into account 
the cultural diversity of different target user groups. 

Globalization: As one of three degrees of inter- 
national products, it means a “look like” culture-less 
international standard for use in all markets (in 
accordance with Day, 1996). 

Human Machine System (HMS): Based on 
the acceptance of an interaction between human 
and machine, it is a summary of all elements of the 
hard-, soft- and useware. The term includes the 
micro (UI) and macro (organization) aspects of a 
human machine system. 

Intercultural Human Machine System: The 
intercultural human machine system takes into account 
the cultural diversity of human (different user 
requirements) and machine (variation of usage 
situations), in addition to a standard HMS. 

Internationalization: As one of three degrees 
of international products, it means a base structure 
with the intent of later customizing and with struc- 
tural and technical possibilities for it (in accordance 
with Day, 1996). 

Localization: As one of three degrees of international 
products, it means the development of 
culture-specific packages for a particular (local) 
market (in accordance with Day, 1996). 

User-Oriented Design: Development approach 
with a focus on users’ requirements and users’ 
needs as basis for a system or product development. 
INTRODUCTION 

Computer-Supported Cooperative Work (CSCW) is 
largely an applied discipline, technologically sup- 
porting multiple individuals, their group processes, 
their dynamics, and so on. CSCW is a research 
endeavor that studies the use of, designs, and evalu- 
ates computer technologies to support groups, orga- 
nizations, communities, and societies. It is interdisci- 
plinary, marshalling research from different disci- 
plines such as anthropology, sociology, organiza- 
tional psychology, cognitive psychology, social psy- 
chology, and information and computer sciences. 
Some examples of CSCW systems are group deci- 
sion support systems (e.g., Nunamaker, Dennis, 
Valacich, Vogel, & George, 1991), group authoring 
systems (e.g., Guzdial, Rick, & Kerimbaev, 2000), 
and computer-mediated communication systems (e.g., 
Sproull & Kiesler, 1991). 

Behavioral and social sciences provide a rich 
body of research and theory about principles of 
human behavior. However, researchers and devel- 
opers have rarely taken advantage of this trove of 
empirical phenomena and theory (Kraut, 2003). 
Recently, at the 2004 Conference on CSCW, there 
was a panel discussion chaired by Sara Kiesler 
(Barley, Kiesler, Kraut, Dutton, Resnick, & Yates, 
2004) on the topic of incorporating group and orga- 
nization theory in CSCW. Broadly speaking, the 
panel discussed some theories applicable to CSCW 
and debated their usefulness. 

In this article, we use the theory of small groups 
as complex systems from social psychology in a 
brief example to allude to how it can be used to 
inform CSCW methodologically and conceptually. 



BACKGROUND 

Preaching to the choir, Dan Shapiro at the 1994 
Conference on CSCW made a strong call for a 
broader integration of the social sciences to better 
understand group- and organizational-level com- 
puter systems (Shapiro, 1994). Shapiro contrasted 
his proposal with the dominant use of 
ethnomethodology in CSCW research. As he noted, 
ethnomethodology implies a commitment to a 
worldview in which theories and other abstractions 
are rejected. Therefore, ethnographic accounts of 
behavior are driven not by explanation but “by the 
stringent discipline of observation and description” 
(p. 418). The result has been perhaps excellent 
designs, but typically, there is little sustained work to 
develop first principles that can be applied else- 
where (Barley et al., 2004). 

Finholt and Teasley (1998) provided evidence of 
Shapiro’s concern by analyzing citations in the ACM 
Proceedings of the Conference on CSCW. For 
example, examination of the 162 papers that ap- 
peared between 1990 and 1996 showed that each 
conference had a small number of papers with a 
psychological orientation. Overall, however, the pro- 
ceedings indicated only modest attention to psycho- 
logical questions, and this attention is diminishing. 
For instance, 77 out of 695 citations referenced the 
psychological literature in the 1990 Proceedings. By 
1996, despite a 34% increase in the total number of 
citations, the number of references to the psycho- 
logical literature decreased by 39% to 46 out of 933 
citations. Thus, based on this study, the authors 
argue that the CSCW community should adopt a 
stronger orientation to social science disciplines. 
Greater attention to psychological literature will 
offer well-validated principles about human behav- 
ior in group and organizational contexts, and convey 
data collection and analysis methods that identify 
salient and generalizable features of human behav- 
ior (Finholt & Teasley, 1998). 
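
The citation counts quoted above from Finholt and Teasley (1998) can be rechecked with a few lines of arithmetic. The following is a minimal sketch in Python; the function and variable names are illustrative assumptions, and only the four counts come from the text. Recomputed this way, total citations grow by about 34%, citations to the psychological literature fall by roughly 40% (close to the 39% reported), and the psychological share of all citations drops from about 11% to about 5%. Where the recomputed and reported values differ, the figures in the cited study take precedence.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


def share(part: float, total: float) -> float:
    """Part as a percentage of total."""
    return part / total * 100.0


# Counts as quoted in the text (Finholt & Teasley, 1998).
psych_1990, total_1990 = 77, 695
psych_1996, total_1996 = 46, 933

total_growth = percent_change(total_1990, total_1996)
psych_change = percent_change(psych_1990, psych_1996)
share_1990 = share(psych_1990, total_1990)
share_1996 = share(psych_1996, total_1996)

print(f"Change in total citations:         {total_growth:+.1f}%")
print(f"Change in psychological citations: {psych_change:+.1f}%")
print(f"Psychological share in 1990:       {share_1990:.1f}%")
print(f"Psychological share in 1996:       {share_1996:.1f}%")
```

Looking at the share of citations rather than the raw counts makes the trend easier to compare across years, because the total number of citations also changed.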

Kraut (2003, p. 354) warns of “disciplinary in- 
breeding”, where researchers tend to cite work 
within their own community. For instance, in contrast 
to the declining number of citations to the 
psychological literature, citations to the 
CSCW literature grew from 70 in 1990 to 233 in 
1996, roughly a 3.3-fold rise (Finholt & Teasley, 
1998). Kraut argues that unlike theories in cognitive 
psychology, social psychology as a theoretical base 
has been inadequately mined in the HCI and CSCW 
literatures (Barley et al., 2004). Part of the reason is 
the mismatch of goals and values of CSCW research 
with those of social psychology. CSCW is primarily 
an engineering discipline, whose goal is problem 
solving; in contrast, social psychology views itself as 
a behavioral science, whose mission is to uniquely 
determine the causes for social phenomena. 

EXAMPLE 

Social psychology has a rich body of theoretical 
literature that CSCW can build on (Beenen et al., 
2004; Farooq, Singley, Fairweather, & Lam, 2004; 
Kraut, 2003). Let us take an example of a theory 
from social psychology that has implications for 
CSCW. Consider the theory of small groups as 
complex systems (for details of the theory, refer to 
Arrow, McGrath, & Berdahl, 2000). According to 
the theory, groups are intact social systems embed- 
ded within physical, temporal, socio-cultural, and 
organizational contexts. Effective study of groups 
requires attention to at least three system levels: 
individual members, the group as a system, and 
various layers of embedding contexts — both for the 
group as an entity and for its members. The follow- 
ing social psychological study illustrates how this 
theory can be leveraged in CSCW. 

In the 1971 Stanford Prison Experiment (Bower, 
2004), Zimbardo randomly assigned male college 
students to roles as either inmates or guards in a 
simulated prison. Within days, the young guards 
were stripping prisoners naked and denying them 
food. Zimbardo and his colleagues concluded that 
anyone given a guard’s uniform and power over 
prisoners succumbs to that situation’s siren call to 
abuse underlings. Currently, the validity and 
conclusions of this study are being challenged on the 
grounds that it used artificial settings and that 
abuses by the guards stemmed from subtle cues 
given by experimenters (p. 106). In a recent and 
similar study to explore the dynamics of power in 
groups, Haslam and Reicher (2003) indicate 
that tyranny does not arise simply from one group 
having power over another. Group members must 
share a definition of their social roles to identify with 
each other and promote group solidarity. In this 
study, volunteers assigned to be prison guards had 
trouble wielding power because they failed to de- 
velop common assumptions about their roles as 
guards. “It is the breakdown of groups and resulting 
sense of powerlessness that creates the conditions 
under which tyranny can triumph,” (p. 108) Haslam 
holds. 

In light of the above-mentioned study, the theory 
of groups as complex systems has at least two 
implications for CSCW. First, the theory warrants a 
research strategy that draws on both experimental 
and naturalistic traditions (Arrow et al., 2000). This 
will allow researchers to mitigate the difficulties of 
both laboratory experiments (e.g., lack of contextual 
realism) and field studies (e.g., lack of 
generalizability). Such a theory-driven research 
strategy can enrich current evaluation techniques in 
CSCW by increasing methodological robustness and 
validation (e.g., Convertino, Neale, Hobby, Carroll, 
& Rosson, 2004). 

Second, the theory sheds light on the dynamics of 
power in a group. Arrow et al. (2000) assert that 
negotiations among members about power needs 
and goals typically involve both dyadic struggles to 
clarify relative power and collective norms about the 
status and influence structure (this was corrobo- 
rated by Haslam and Reicher, 2003). This entails 
design implications for CSCW. Drawing on Arrow 
et al.’s (2000) theory, CSCW systems should then 
support, in general, design features that allow the 
fulfillment of group members’ needs for attaining 
functional levels of agreement, explicit or implicit, 
regarding the following: (1) how membership status 
is established within the group (e.g., determined by a 
leader, based on the value of contributions to group 
projects, based on seniority); (2) the degree of power 
disparity between members allowed by the group; 
and (3) the acceptable uses of power to influence 
others in the group and how to sanction violations of 
these norms. 



FUTURE TRENDS 

CSCW is going to become increasingly interdisciplinary. 
CSCW does and will continue to provide a 
fertile context for bringing together multiple disciplines. 
To this end, marshalling theoretical literature from 
these disciplines will create a dynamic interplay in 
CSCW between theory, design, and practice. As 
CSCW co-evolves with its sister disciplines, we 
foresee a dialectical process of the former leveraging 
theoretical underpinnings of the latter, developing 
and refining these further, and in turn, also informing 
and enriching the theoretical base of its sister 
disciplines in the context of collaborative technology 
and work. 



CONCLUSION 

To avoid producing unusable systems and badly 
mechanizing and distorting collaboration and other 
social activity, it is imperative to address the 
challenge of CSCW’s social-technical gap: the divide 
between what we know we must support socially and 
what we can support technically (Ackerman, 2000). 
An understanding of this gap lies at the heart of 
CSCW’s intellectual contribution, which can be realized 
by fundamentally understanding the theoretical 
foundations of how people really work and live in groups, 
organizations, communities, and other forms of 
collective life (Ackerman, 2000). We certainly hold the 
view that science does and must continually strive to 
bring theory and fact into closer agreement (Kuhn, 
1962). Perhaps the first step and challenge in this 
direction is to bridge the prescriptive discourse of 
CSCW with the descriptive disposition of the social 
sciences in order to arrive at scientifically satisfying 
and technically effective solutions. 
KEY TERMS 

Computer Supported Cooperative Work 
(CSCW): Research area that studies the design, 
evaluation, and deployment of computing technolo- 
gies to support group and organizational activity. 



Complex Systems: This concept is borrowed 
from Complexity Theory (see Arrow et al., 2000, for 
a detailed discussion). Complex systems are 
systems that are neither rigidly ordered nor highly 
disordered. System complexity is defined as the 
number and variety of identifiable regularities in the 
structure and behaviour of the group, given a de- 
scription of that group at a fixed level of detail. Given 
this definition, Arrow et al. (2000) suggest that 
groups tend to increase in complexity over time, i.e. 
the number and variety of patterned regularities in 
the structure and behaviour of the group increase 
over time. 

Disciplinary Inbreeding: A phrase coined by 
Kraut (2003, p. 354) to refer to the phenomenon of 
researchers citing academic work within their own 
community. He used it in the specific context of the 
CSCW community. 

Ethnomethodology: In the context of work, it is 
an approach to study how people actually order their 
working activities through mutual attentiveness to 
what has to be done. Ethnomethodology refuses any 
epistemological or ontological commitments, and 
limits its inquiry to what is directly observable and 
what can be plausibly inferred from observation. 

Small Groups: According to the theory of small 
groups as complex systems (Arrow et al., 2000), a 
group is a complex, adaptive, dynamic, coordinated, 
and bounded set of patterned relations among mem- 
bers, task, and tools. Small groups generally have 
more than one dyadic link (although a dyad can 
comprise a group) but are bounded by larger collec- 
tives (e.g., an organization). 
INTRODUCTION 

South Africa is a multi-lingual country with a popu- 
lation of about 40.5 million people. South Africa has 
more official languages at a national level than any 
other country in the world. In addition to English 
and Afrikaans, the eleven official languages include 
the indigenous languages Southern Sotho, Northern 
Sotho, Tswana, Zulu, Xhosa, Swati, Ndebele, Tsonga, 
and Venda (Pretorius & Bosch, 2003). Figure 1 
depicts the breakdown of the South African official 
languages as mother tongues for South African 
citizens. 

Although English ranks fifth (9%) as a mother 
tongue, there is a tendency among national leaders, 
politicians, business people, and officials to use 
English more frequently than any of the other lan- 
guages. In a national survey on language use and 
language interaction conducted by the Pan South 
African Language Board (Language Use and Board 
Interaction in South Africa, 2000), only 22% of the 
respondents indicated that they fully understand 
speeches and statements made in English, while 
19% indicated that they seldom understand informa- 
tion conveyed in English. 

The rate of electrification in South Africa is 
66.1%. The total number of people with access to 
electricity is 28.3 million, and the total number of 
people without access to electricity is 14.5 million 
(International Energy Agency, 2002). Although the 
gap between the “haves” and “have-nots” is 
narrowing, a significant portion of the South African 
population is still without the basic amenities of life. 
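
The electrification rate quoted above follows directly from the two head counts. A minimal sketch in Python, with illustrative variable names that are not part of the source:

```python
# Head counts in millions, as quoted from the International Energy Agency (2002).
with_access = 28.3
without_access = 14.5

# Population implied by the two counts, and the resulting electrification rate.
implied_population = with_access + without_access
electrification_rate = with_access / implied_population * 100.0

print(f"Implied population:   {implied_population:.1f} million")
print(f"Electrification rate: {electrification_rate:.1f}%")
```

The population implied by these counts (about 42.8 million) is somewhat higher than the 40.5 million figure used for the language breakdown; the two numbers are cited from different sources.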

This unique environment sets the tone for a 
creative research agenda for HCI researchers and 
practitioners in South Africa. 

BACKGROUND 

E-Activities in South Africa 

SA has been active in the e-revolution. The South 
African Green Paper on Electronic Commerce (EC) 
(Central Government, 2000) is divided into four 
categories. Each category contains key issues or 
areas of concern that need serious consideration in 
EC policy formulation: 

• the need for confidence in the security and 
privacy of transactions performed electroni- 
cally; 

• the need to enhance the information infrastruc- 
ture for electronic commerce; 



Figure 1. Mother-tongue division as per official language (n = 40.5 million speakers) 

[Pie chart] Zulu 22%; Xhosa 18%; Afrikaans 16%; Northern Sotho 10%; English 9%; Southern Sotho 7%; Tswana 7%; Tsonga 4%; Swati 3%; Ndebele 2%; Venda 2% 

Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited. 




HCI in South Africa 



• the need to establish rules that will govern 
electronic commerce; 

• the need to bring the opportunities of e-com- 
merce to the entire population. 

EC has not only affected government but has also 
actively moved into the mainstream of the South African 
economy. Sectors of the economy that are using this 
technology are listed in Table 1, along with examples 
of companies using EC in each sector. 

Electronic Communications and 
Transactions Bill 

The Electronic Communications and Transactions 
Bill (2002) is an attempt by the Republic of South 
Africa to provide for the facilitation and regulation of 
electronic communications and transactions; to pro- 
vide for the development of a national e-strategy for 
the Republic; to promote universal access to elec- 
tronic communications and transactions and the use 
of electronic transactions by small, medium and 
micro enterprises (SMMEs); to provide for human 
resource development in electronic transactions; to 
prevent abuse of information systems; and to en- 
courage the use of e-government services, and 
provide for matters connected therewith. 

Some provisions of the bill are specifically 
directed at making policy and improving function in 
HCI-related areas. These are elucidated in the 
following bulleted items: 

• To promote universal access primarily in 
under-serviced areas. 

• To remove and prevent barriers to electronic 
communications and transactions in the 
Republic. 

• To promote e-government services and 
electronic communications and transactions with 
public and private bodies, institutions, and 
citizens. 

• To ensure that electronic transactions in the 
Republic conform to the highest international 
standards. 

• To encourage investment and innovation in 
respect of electronic transactions in the 
Republic. 

• To develop a safe, secure, and effective 
environment for the consumer, business and the 
government to conduct and use electronic 
transactions. 

• To promote the development of electronic 
transaction services, which are responsive to the 
needs of users and consumers. 

• To ensure that, in relation to the provision of 
electronic transactions services, the special 
needs of particular communities and areas, and 
the disabled are duly taken into account. 

• To ensure compliance with accepted 
international technical standards in the provision and 
development of electronic communications and 
transactions. 



Table 1. Sectors of the SA economy using EC, companies using EC within those sectors and their URLs 

Sector          Company                         URL 
Banking-retail  ABSA                            http://www.absa.co.za 
Finance         SA Home Loans                   http://www.sahomeloans.com/ 
Insurance       Liberty Life                    MyLife.com 
Media           Independent Newspapers Online   http://www.iol.co.za 
Retail          Pick 'n Pay                     http://www.pnp.co.za/ 
Travel          SAA                             Kulula.com 
Recruitment     Career Junction                 http://www.careerjunction.co.za 
Mining          Mincom                          http://www.mincom.com 
Automotive      Motoronline                     http://www.motoronline.co.za 
Data/telecomm   M-Web                           http://www.mweb.co.za/ 
Health          Clickatell                      http://www.clickatell.co.za 

Though these objectives are Utopian, they are 
the first steps towards developing a manageable 
framework for the sustainable development of the e- 
community in South Africa. It is only by actions like 
these that we can get active role players involved in 
the development of a strategy for the e-community in 
South Africa. 



THE STATE OF THE INTERNET 
IN SA 

Internet access in South Africa continues to grow 
each year, but the rate of growth has slowed signifi- 
cantly. According to one study on Internet access in 
SA (Goldstuck, 2002), only 1 in 15 South Africans 
had access to the Internet at the end of 2001. By 
the end of 2002, Internet access will have improved 
only marginally, to 1 in 14 South Africans. According 
to the report, the slow growth is largely the result of 
delays in licensing a second network operator, 
Telkom’s own uncompromising attitude towards 
Internet service providers, and market ignorance 
about the continued value of the Internet in the wake 
of the technology market crash of 2000 and 2001. 
South Africa will continue to lag behind the rest of the 
world in Internet use until the local telecommunica- 
tion climate is more favorable (Worthington-Smith, 
2002). 

As Goldstuck (2002) points out, the educational 
environment in particular is poised for a boom in 
access, with numerous projects under way to con- 
nect schools to the Internet. That will not only be a 
positive intervention in the short term, but will provide 
a healthy underpinning for the long-term growth of 
Internet access in South Africa. 



HCI IN SA 

HCI is a discipline concerned with the design, evalu- 
ation, and implementation of interactive computer 
systems for human use. It also includes the study of 
phenomena surrounding interactive systems. Ulti- 
mately, HCI is about making products and systems 
easier to use and matching them more closely to 
users’ needs and requirements. HCI is a highly active 
area for R&D and has applications in many countries. 
Concern has been expressed by some that SA 
is lagging seriously behind in this area. For instance, 
there is the worry that SA is not meeting the special 
challenges resulting from our multi-lingual society 
or the need to enable people with low levels of 
literacy to enjoy the benefits afforded by ICT. In 
short, there is a concern that the average South 
African will not gain value from rolling out current 
e-government initiatives. The opportunity for SA 
researchers and developers lies not just in meeting 
the specific needs of the country, but in positioning 
themselves as leaders in HCI in developing coun- 
tries (Miller, 2003). According to Hugo (2003), the 
South African Usability community is small in size 
but large in quality. There are only a handful of 
practitioners, but their work is making a big impact 
on the local IT industry. 

A number of educators are introducing HCI into 
the computer science curricula, and practitioners 
are working with application developers to inte- 
grate user-centered design (UCD) into the overall 
product development life cycle. In addition, the 
local HCI special interest group (CHI-SA, the 
South African ACM SIGCHI Chapter) is actively 
working with several stakeholders, including aca- 
demics and non-government organizations to raise 
awareness of the impact of UCD on usability and 
overall organizational performance. In a few orga- 
nizations, important progress has been made to- 
wards institutionalizing UCD. 

Special attention is being given in South Africa to 
the relationship between multi-culturalism and tech- 
nology dissemination. HCI practitioners are paying 
special attention to the processes of enculturation, 
acculturation, and cultural identities in the localiza- 
tion of software. They have also recognized the 
need to encourage developers to understand the 
role of the many cultural factors at work in the 
design of computing products, as well as the issues 
involved in intercultural and multi-cultural software 
design and internationalization, and how they affect 
the bottom line for organizations. 

The usability community, which essentially con- 
sists of members of the CHI-SA, is currently a 
group of 116 people, 16 of whom are also members 
of ACM SIGCHI. After only one year as an official 
SIGCHI Chapter, CHI-SA distinguished itself in 
2002 by presenting a workshop on multi-cultural 
HCI at the Development Consortium of CHI 2002 
in Minneapolis. 
Hugo (as reported by Wesson and Van Greunen, 
2003) further characterized HCI in SA by highlighting 
the following: 

• A shortage of qualified practitioners and edu- 
cators. There are only a few people involved in 
HCI teaching and research with the majority 
being academics. 

• A lack of awareness and implementation at 
industry level. 

• Isolation and fragmentation between academia, 
industry, private research, development, and 
government. 

• A lack of resources and inadequate training, 
which can result in inappropriate guidelines being 
adopted from the literature. 

• A lack of knowledge of the usability and UCD 
standards that exist in industry, such as ISO 9241 
and ISO 13407. 

There is little evidence of SA commerce and 
industry taking up the principles and practice of 
HCI (Miller, 2003). Bekker (2003), a usability 
consultant with Test and Data Services, has 
noticed the following issues relating to HCI in the SA 
industry: 

• Most of her clients use the Rational Unified 
Process and Microsoft Solutions Framework 
methodologies. 

• Her clients are new to the concepts of HCI. 

• SA companies are not designing usable soft- 
ware. 

• People at the heart of the development process 
(the project managers and developers) fail to 
see the benefits of usability, and tend to ignore 
it (“I think it is because they hardly ever have 
contact with the users and have no idea how 
the users experience their system. They are 
highly computer literate people and cannot 
understand that anyone can battle using their 
system”). 

• Developers do not fully understand the nature 
of the users, which frustrates them. 

• It is mostly the marketing and business depart- 
ments of companies that request usability work 
from her. 



FUTURE TRENDS 

The Internet is becoming a significant tool in the SA 
economy but much more must be done to improve 
and broaden its use in SA. Research on HCI in SA 
is vital for both academics and practitioners. Sev- 
eral universities, notably, the University of South 
Africa, University of Cape Town, University of 
KwaZulu-Natal, University of Pretoria, University 
of Free State, and the Nelson Mandela Metropoli- 
tan University, are involved in teaching HCI. The 
Nelson Mandela Metropolitan University is the 
only university with a usability laboratory; the Uni- 
versity of South Africa is in the process of building 
a usability laboratory and another usability labora- 
tory. 

CONCLUSION 

Miller (2003) reports several barriers that are hin- 
dering the development of HCI in SA, such as: 

• There is minimal awareness or a lack of appreciation 
within the practitioners’ community of what 
HCI is about and the advantages of better 
usability. 

• There is a lack of HCI culture in the country, 
and therefore a lack of appreciation of its 
commercial benefits in industry. 

• There is a lack of collaboration between the 
research and practitioner communities. 

• There are concerns that the National Research 
Foundation (NRF) review panel lacks adequate 
HCI expertise. 

• Only a small number of researchers are inter- 
ested in HCI; ICT-related industry bodies are 
fragmented. 

• There is a reliance on overseas provision of 
computer hardware such as high-resolution 
screens, and so forth. 

• Government has offered little support other 
than the NRF Grant System and the Innova- 
tion Fund, which provide only low levels of 
funding. 
E-Government: Is about re-engineering the 
current way of doing business, by using collaborative 
transactions and processes required by government 
departments to function effectively and economi- 
cally, thus improving the quality of life for citizens 
and promoting competition and innovation. To put it 
simply, e-government is about empowering a 
country’s citizens. 

Electronic Commerce: Uses some form of 
transmission medium through which exchange of 
information takes place in order to conduct business. 

ISO 13407: This standard provides guidance on 
human-centered design activities throughout the life 
cycle of interactive computer-based systems. It is a 
tool for those managing design processes and pro- 
vides guidance on sources of information and stan- 
dards relevant to the human-centered approach. It 
describes human-centered design as a multi-disci- 
plinary activity, which incorporates human factors 
and ergonomics knowledge and techniques with the 
objective of enhancing effectiveness and efficiency, 
improving human working conditions, and counter- 
acting possible adverse effects of use on human 
health, safety, and performance. 

ISO 9241-11: A standard that describes ergo- 
nomic requirements for office work with visual 
display terminals. This standard defines how to 
specify and measure the usability of products, and 
defines the factors that have an effect on usability. 

NRF: National Research Foundation, a govern- 
ment national agency that is responsible for promot- 
ing and supporting basic, applied research as well as 
innovation in South Africa. 

Usability: The ISO 9241-11 standard definition 
for usability identifies three different aspects: (1) a 
specified set of users, (2) specified goals (tasks) 
which have to be measurable in terms of effective- 
ness, efficiency, and satisfaction, and (3) the context 
in which the activity is carried out. 
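
The three measurable aspects named in the usability definition above can be illustrated with a minimal sketch. The operationalizations used here (task completion rate for effectiveness, mean time on completed tasks for efficiency, mean rating on a 1-5 scale for satisfaction) are common conventions assumed for illustration only; they are not prescribed by this article, and the names Session and usability_summary are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Session:
    """One user's attempt at one specified task (hypothetical record)."""
    completed: bool      # did the user reach the specified goal?
    seconds: float       # time spent on the task
    satisfaction: int    # assumed 1-5 rating given after the task


def usability_summary(sessions: Sequence[Session]) -> Dict[str, Optional[float]]:
    """Summarize effectiveness, efficiency, and satisfaction for one task."""
    if not sessions:
        raise ValueError("at least one session is required")
    done = [s for s in sessions if s.completed]
    return {
        # Effectiveness: share of sessions in which the goal was achieved.
        "effectiveness": len(done) / len(sessions),
        # Efficiency: mean time of successful sessions (None if none succeeded).
        "efficiency_seconds": mean(s.seconds for s in done) if done else None,
        # Satisfaction: mean rating over all sessions.
        "satisfaction": mean(s.satisfaction for s in sessions),
    }


sessions = [
    Session(True, 40.0, 4),
    Session(True, 60.0, 5),
    Session(False, 90.0, 2),
    Session(True, 50.0, 5),
]
summary = usability_summary(sessions)
```

The figures are only meaningful for the specified users and context of use in which the sessions were recorded, which is the third aspect of the definition.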
QUALITY OF INTERACTIVE 
PRODUCTS 

Human-computer interaction (HCI) can be defined 
as a discipline, which is concerned with the design, 
evaluation and implementation of interactive computing 
systems [products] for human use (Hewett et 
al., 1996). Evaluation and design require a definition 
of what constitutes a good or bad product and, thus, 
a definition of interactive product quality (IPQ). 
Usability is one such widely accepted definition. ISO 
9241 Part 11 (ISO, 1998) defines it as the “extent to 
which a product can be used by specified users to 
achieve specified goals with effectiveness, effi- 
ciency and satisfaction in a specified context of 
use.” 

Although widely accepted, this definition’s focus 
on tasks and goals, their efficient achievement, and 
the cognitive information processes involved has 
repeatedly drawn criticism, as far back as Carroll and 
Thomas’ (1988) emphatic plea not to forget the 
“fun” over simplicity and efficiency (see also Carroll, 
2004). 

Since then, several attempts have been made to 
broaden and enrich HCI’s narrow, work-related 
view on IPQ (see, for example, Blythe, Overbeeke, 
Monk, & Wright, 2003; Green & Jordan, 2002; 
Helander & Tham, 2004). The objective of this 
article is to provide an overview of HCI’s current 
theoretical approaches to an enriched IPQ. Specifically, 
needs that go beyond the instrumental and the 
role of emotions, affect, and experiences are dis- 
cussed. 



BACKGROUND 

Driven by the requirements of the consumer 
product market, Logan (1994) was the first to formulate 
a notion of emotional usability, which complements 
traditional, “behavioral” usability. He defined emotional 
usability as “the degree to which a product is 
desirable or serves a need beyond the [...] func- 
tional objective” (p. 61). It is to be understood as “an 
expanded definition of needs and requirements, such 
as fun, excitement and appeal” (Logan, Augaitis, & 
Renk, 1994, p. 369). Specifically, Logan and col- 
leagues suggested a human need for novelty, change, 
and to express oneself through objects. 

Other authors proposed alternative lists of needs 
to be addressed by an appealing and enjoyable 
interactive product. In an early attempt, Malone 
(1981, 1984) suggested a need for challenge, for 
curiosity, and for being emotionally bound by an 
appealing fantasy (metaphor). Jordan (2000) distin- 
guished four groups of needs: physiological (e.g., 
touch, taste, smell), social (e.g., relationship with 
others, status), psychological (e.g., cognitive and 
emotional reactions), and Id-needs (e.g., aesthetics, 
embodied values). Gaver and Martin (2000) com- 
piled a list of non-instrumental needs, such as nov- 
elty, surprise, diversion, mystery, influencing the 
environment, intimacy, and to understand and change 
one’s self. Taken together, these approaches have 
at least two aspects in common: (a) they argue for a 
more holistic understanding of the human in HCI and 
(b) they seek to enrich HCI’s narrow view on IPQ 
with non-instrumental needs to complement the 
traditional, task-oriented approach. 

Although the particular lists of needs differ from 
author to author, two broad categories — widely sup- 
ported by psychological research and theory — can 
be identified, namely competence/personal growth, 
for example, the desire to perfect one’s knowledge 
and skills, and relatedness/self-expression, for 
example, the desire to communicate a favorable 
identity to relevant others (see Hassenzahl, 2003). 

A sense of competence, for example, to take on 
and master hard challenges, is one of the core needs 
in Ryan and Deci’s (2000) self-determination 
theory, which formulates antecedents of personal 
well-being. Similarly, Csikszentmihalyi’s (1997) flow 
theory, which became especially popular in the 
context of analyzing Internet use (see Chen, Wigand, 
& Nilan, 1999; Novak, Hoffman, & Yung, 2000), 
suggests that individuals will experience a positive 
psychological state (flow) as long as the challenge 
an activity poses is met by the individuals’ 
skills. Interactive products could address this need 
by opening up novel and creative uses 
while, at the same time, providing appropriate means 
to master the resulting challenges. 

A second need identified by Ryan and Deci 
(2000) is relatedness — a sense of closeness with 
others. To experience relatedness requires social 
interaction and, as Robinson (1993, cited in Leventhal, 
Teasley, Blumenthal, Instone, Stone, & Donskoy, 
1996) noted, products are inevitably statements in 
the on-going interaction with relevant others. A 
product can be understood as an extension of an 
individual’s self (Belk, 1988) — its possession and 
use serves self-expressive functions beyond the 
mere instrumental (e.g., Wicklund & Gollwitzer, 
1982). 

To summarize, an appealing interactive product 
may support needs beyond the mere instrumental. 
Needs that are likely to be important in the context 
of design and evaluation are competence/personal 
growth, which requires a balance between challenge 
and ability, and relatedness/self-expression, 
which requires a product to communicate favorable 
messages to relevant others. 



NEEDS BEYOND THE 
INSTRUMENTAL 

In this article, the terms instrumental and non- 
instrumental are used to distinguish between HCI’s 
traditional view on IPQ and newer additions. Re- 
peatedly, authors refer to instrumental aspects of 
products as utilitarian (e.g., Batra & Ahtola, 1990), 
functional (e.g., Kempf, 1999) or pragmatic (e.g., 
Hassenzahl, 2003), and to non-instrumental as he- 
donic. However, hedonic can have two different 
meanings: some authors understand it as the affec- 
tive quality (see section below) of a product, for 
example, pleasure, enjoyment, fun derived from 
possession or usage (e.g., Batra & Ahtola, 1990; 
Huang, 2003), while others see it as non-task-related 
attributes, such as novelty or a product’s ability to 
evoke memories (e.g., Hassenzahl, 2003). Besides 
these slight differences in meaning, instrumental and 
non-instrumental aspects are mostly viewed as separate 
but complementary constructs. Studies, for example, 
showed instrumental as well as non-instrumental 
aspects to be equally important predictors of 
product appeal (e.g., Hassenzahl, 2002a; Huang, 
2003). A noteworthy exception to the general notion 
of ideally addressing instrumental and non-instrumental 
needs simultaneously is Gaver et al.’s (2004b) 
concept of ludic products. According to them, a 
ludic product promotes curiosity and exploration and 
de-emphasizes the pursuit of external (instrumental) 
goals. Or, as Gaver (personal communication) put it: 
Ludic products “ ... aren’t clearly useful, nor are 
they concerned with entertainment alone. Their 
usefulness is rather in prompting awareness and 
insight than in completing a given task.” Gaver et al. 
(2004b) argue, then, for a new product category 
aimed at solely supporting personal growth/com- 
petence by providing a context for new, challenging 
and intrinsically interesting experiences and by de- 
liberately turning the user’s focus away from func- 
tionality. 

A question closely related to instrumental and 
non-instrumental needs is their relative importance. 
Jordan (2000) argued for a hierarchical organization 
of needs (based on Maslow’s [1954] hierarchical 
concept of human needs): The first level is product 
functionality, the second level is usability and the 
third level is “pleasure,” which consists of his four 
non-instrumental aspects already presented earlier. 
Such a model assumes that the satisfaction of instru- 
mental needs is a necessary precondition for valuing 
non-instrumental needs. A product must, thus, pro- 
vide functionality, before, for example, being appre- 
ciated for its self-expressive quality. 

This strict assumption can be questioned. Souve- 
nirs, for example, are products, which satisfy a non- 
instrumental need (keeping a memory alive, see 
Hassenzahl, 2003) without providing functionality. 
However, for many products, functionality can be 
seen as a necessary precondition for acceptance. A 
mobile phone, for instance, which does not work will 
definitely fail on the market, regardless of its non- 
instrumental qualities. 
One may, thus, understand a hierarchy as a particular, 
context-dependent prioritization of needs 
(Sheldon, Elliott, Kim, & Kasser, 2001). The relative 
importance of needs may vary with product catego- 
ries (e.g., consumers’ versus producers’ goods), 
individuals (e.g., early versus late adopters) or spe- 
cific usage situations. Hassenzahl, Kekez, and 
Burmester (2002), for example, found instrumental 
aspects of Web sites to be of value only if participants 
were given explicit tasks to achieve. Instrumental 
aspects lost their importance for individuals with the 
instruction “to just have fun” with the Web site. 

EMOTIONS, AFFECT, AND 
EXPERIENCE 

Recently, the term emotional design (Norman, 2004) 
gained significant attention in the context of HCI. 
Many researchers and practitioners advocate the 
consideration of emotions in the design of interactive 
products, an interest probably triggered by science’s 
general, newly aroused attention to emotions and 
their interplay with cognition (e.g., Damasio, 1994). 
In the context of HCI, Djajadiningrat, Overbeeke, 
and Wensveen (2000), for instance, argued for explicitly 
taking both knowing and feeling into account. 
Desmet and Hekkert (2002) went a step further by 
presenting an explicit model of product emotions 
based on Ortony, Clore, and Collins’ (1988) emotion 
theory. 

In general, emotions in design are treated in two 
ways: some authors stress their importance as consequences 
of product use (e.g., Desmet & Hekkert, 
2002; Hassenzahl, 2003; Kim & Moon, 1998; 
Tractinsky & Zmiri, in press), whereas others stress 
their importance as antecedents of product use and 
evaluative judgments (e.g., Singh & Dalal, 1999; 
the visceral level in Norman, 2004). 

The “emotions as consequences” perspective 
views particular emotions as the result of a cognitive 
appraisal process (see Scherer, 2003). Initial affec- 
tive reactions to objects, persons, or events are 
further elaborated by combining them with expecta- 
tions or other cognitive content. Surprise, for ex- 
ample, may be felt, if an event deviates from expec- 
tations. In the case of a positive deviation, surprise 
may then give way to joy. An important aspect of 
emotions is their situatedness. They are the result of 
the complex interplay of an individual’s psychological 
state (e.g., expectations, moods, saturation level) 
and the situation (product and particular context of 
use). Slight differences in one of the elements can 
lead to a different emotion. Another important 
aspect is that emotions are transient. They occur, 
are felt, and last only a relatively short period of 
time. Nevertheless, they are an important element 
of experience. 

The ephemeral nature of emotions and the com- 
plexity of eliciting conditions may make it difficult to 
explicitly design them (Hassenzahl, 2004). Design- 
ers would need control over as many elements of an 
experience as possible. Good examples of environments 
with a high level of control from the designer’s 
perspective are theme parks or movies. In product 
design, however, control is not as high and, thus, 
designers may have to be content with creating the 
possibility of an emotional reaction, that is, the 
context for an experience rather than the experi- 
ence itself (Djajadiningrat et al., 2000; Wright, 
McCarthy, & Meekison, 2003). 

Zajonc (1980) questioned the view of 
emotions as consequences of a cognitive appraisal. 
He showed that emotional reactions could be instantaneous 
and automatic, without cognitive processing. 
And indeed, neurophysiology discovered a neural 
shortcut that takes information from the senses 
directly to the part of the brain responsible for 
emotional reactions (amygdala) before higher order 
cognitive systems have had a chance to intervene 
(e.g., LeDoux, 1994). However, these instanta- 
neous emotional reactions differ from complex 
emotions like hate, love, disappointment, or satis- 
faction. They are more diffuse, mainly representing 
a good/bad feeling of various intensities about an 
object, person, or event. To distinguish this type of 
emotional reaction from the more complex emotions 
discussed earlier, they are often called affective 
reactions in contrast to emotions. Norman (2004) 
labeled the immediate reaction “visceral” (bodily) as 
opposed to the more “reflective.” 

Importantly, one’s own immediate, unmediated 
affective reactions are often used as information 
(feelings-as-information, Schwarz & Clore, 1983), 
influencing and guiding future behavior. Damasio 
(1994) developed the notion of somatic markers 
attached to objects, persons, or events, which influ- 
ence the way we make choices by signaling “good” 
or “bad”. Research on persuasion, for example, has 
identified two ways of information processing: systematic 
(central) and heuristic (peripheral). Individuals 
not able or motivated to process argument-related 
information rely more strongly on peripheral 
cues, such as their own immediate affective 
reactions towards an argument (e.g., Petty & 
Cacioppo, 1986). These results emphasize the im- 
portance of a careful consideration of immediate, 
product-driven emotional reactions for HCI. 



FUTURE TRENDS 

To design a “hedonic” interactive product requires 
an understanding of the link between designable 
product features (e.g., functionality, presentational 
and interactional style, content), resulting product 
attributes (e.g., simple, sober, exciting, friendly) and 
the fulfillment of particular needs. In the same vein 
as a particular user interface layout may imply 
simplicity, which in turn promises fulfillment of the 
need to achieve behavioral goals, additional at- 
tributes able to signal and promote fulfillment of 
competency or self-expression needs (and ways to 
create these) have to be identified. We may, then, 
witness the emergence of principles for designing 
hedonic products comparable to existing principles 
for designing usable products. 

As long as HCI strongly advocates a systematic, 
user-centered design process (usability engineering, 
e.g., Mayhew, 1999; Nielsen, 1993), tools and 
techniques will be developed to support the inclusion 
of non-instrumental needs and emotions. Some tech- 
niques have already emerged: measuring emotions 
in product development (e.g., Desmet, 2003), gath- 
ering holistic product perceptions (Hassenzahl, 
2002b), assessing the fulfillment of non-instrumental 
needs (e.g., Hassenzahl, in press) or eliciting non- 
instrumental, “inspirational” data (e.g., Gaver, 
Boucher, Pennington, & Walker, 2004a). Others will 
surely follow. 



CONCLUSION 

Individuals have general needs, and products can 
play a role in their fulfillment. The actual fulfillment 
of needs (when attributed to the product) is perceived 
as quality. Certainly, individuals have instrumental 
goals and functional requirements that a 
product may fulfill; however, additional non-instrumental, 
hedonic needs are important, too. Two needs 
seem to be of particular relevance: personal growth/ 
competence and self-expression/relatedness. 
Product attributes have to be identified, which signal 
and fulfill instrumental as well as non-instrumental 
needs. A beautiful product, for example, may be 
especially good for self-expression (Hassenzahl, in 
press; Tractinsky & Zmiri, in press); a product that 
balances simplicity/ease (usability) and novelty/stimulation 
may fulfill the need for personal growth. 

Human needs are important, and individuals can 
certainly reach general conclusions about their relative 
importance (see Sheldon et al., 2001). However, 
quality is also rooted in the actual experience of a 
product. Experience consists of numerous elements 
(e.g., the product, the user’s psychological states, 
their goals, other individuals, etc.) and their interplay 
(see Wright et al., 2003). The complexity of an 
experience makes it a unique event — hard to repeat 
and even harder to create deliberately. But experi- 
ence nevertheless matters. Experiences are highly 
valued (Boven & Gilovich, 2003), and, consequently, 
many products are now marketed as experiences 
rather than products (e.g., Schmitt, 1999). 
From an HCI perspective, it seems especially impor- 
tant to better understand experiences in the context 
of product use. 

Definitions of quality have an enormous impact 
on the success of interactive products. Addressing 
human needs as a whole and providing rich experi- 
ences would enhance the role of interactive prod- 
ucts in the future. 
KEY TERMS 

Affect: An umbrella term used to refer to mood, 
emotion, and other processes, which address related 
phenomena. The present article more specifically 
uses the term “affective reaction” to distinguish an 
individual’s initial, spontaneous, undifferentiated, and 
largely physiologically-driven response to an event, 
person, or object from the more cognitively differen- 
tiated “emotion.” 



Emotion: A transient psychological state, such 
as joy, sadness, anger. Most emotions are the con- 
sequence of a cognitive appraisal process, which 
links an initial affective reaction (see “Affect” term 
definition) to momentarily available “information”, 
such as one’s expectations, beliefs, situational cues, 
other individuals, and so forth. 

Experience: A holistic account of a particular 
episode, which stretches over time, often with a 
definite beginning and ending. Examples of (posi- 
tive) experiences are: visiting a theme park or con- 
suming a bottle of wine. An experience consists of 
numerous elements (e.g., product, user’s psycho- 
logical states, beliefs, expectations, goals, other indi- 
viduals, etc.) and their relation. It is assumed that 
humans constantly monitor their internal, psycho- 
logical state. They are able to access their current 
state during an experience and to report it (i.e., 
experience sampling). Individuals are further able to 
form a summary, retrospective assessment of an 
experience. However, this retrospective assess- 
ment is not a one-to-one summary of everything that 
happened during the experience, but rather overem- 
phasizes single outstanding moments and the end of 
the experience. 

Instrumental Needs: Particular, momentarily 
relevant behavioral goals, such as making a tele- 
phone call, withdrawing money from one’s bank 
account, or ordering a book in an online shop. 
Product attributes related to the achievement of 
behavioral goals are often referred to as “utilitar- 
ian,” “pragmatic,” or “functional.” 

Non-Instrumental Needs: Go beyond the mere 
achievement of behavioral goals, such as self-ex- 
pression or personal growth. Product attributes re- 
lated to the fulfillment of non-instrumental needs are 
often referred to as “hedonic.” A more specific use 
of the term “hedonic” stresses the product’s “affec- 
tive” quality, for example, its ability to evoke positive 
affective reactions (mood, emotions, see “Affect” 
term definition). 
INTRODUCTION 

Trend detection has been studied by researchers in 
many fields, such as statistics, economics, finance, 
information science, and computer science 
(Basseville & Nikiforov, 1993; Chen, 2004; Del 
Negro, 2001). Trend detection studies can be di- 
vided into two broad categories. At technical levels, 
the focus is on detecting and tracking emerging 
trends based on dedicated algorithms; at decision 
making and management levels, the focus is on the 
process in which algorithmically identified temporal 
patterns can be translated into elements of a decision 
making process. 

Much of the work is concentrated in the first 
category, primarily focusing on the efficiency and 
effectiveness from an algorithmic perspective. In 
contrast, relatively few studies in the literature 
have addressed the role of the human perceptual and 
cognitive system in interpreting and utilizing 
algorithmically detected trends and changes in users’ 
own working environments. In particular, human 
factors have not been adequately taken into account; 
trend detection and tracking, especially in text 
document processing and more recent emerging 
application areas, has not been studied as an integral 
part of decision-making and related activities. However, 
rapidly growing technology and research in 
the field of human-computer interaction have opened 
vast and thought-provoking possibilities 
for incorporating usability and heuristic design into 
the areas of trend detection and tracking. 



BACKGROUND 

In this section, we briefly review trend detection and 
its dependence on time and context, topic detection 
and tracking, supported by instances of their impact 
in diverse fields, and emerging trend detection, 
especially for text data. 

Trend Detection 

A trend is typically defined as a continuous change 
of a variable over a period of time, for example, 
unemployment numbers increase as the economy 
enters a cycle of recession. Trend detection, in 
general, and topic detection techniques are groups of 
algorithmic tools designed to identify significant 
changes of quantitative metrics of underlying phe- 
nomena. The goal of detection is to enable users to 
identify the presence of such trends based on a 
spectrum of monitored variables. The response time 
of a detection technique can be measured by the 
time duration of the available input data and the 
identifiable trend; it is dependent on specific applica- 
tion domains. For example, anti-terrorism and na- 
tional security may require highly responsive trend 
detection and change detection capabilities, whereas 
geological and astronomical applications require long- 
range detection tools. Other applications of this 
technology exist in the fields of business and medi- 
cine. 
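
As a minimal sketch of the idea described above, the following Python fragment flags a trend when the least-squares slope of a monitored variable over a trailing window reaches a threshold. The sliding-window slope is one simple technique chosen here for illustration, not a method taken from the cited literature; the function names, the window length, the threshold, and the unemployment figures are all hypothetical.

```python
from typing import Optional, Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against their positions 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def detect_trend(values: Sequence[float], window: int,
                 threshold: float) -> Optional[int]:
    """Return the index of the first observation at which the slope over
    the trailing window reaches the threshold in magnitude, or None."""
    for end in range(window, len(values) + 1):
        if abs(slope(values[end - window:end])) >= threshold:
            return end - 1
    return None


# Hypothetical monthly unemployment rates (percent): flat, then rising.
unemployment = [5.0, 5.1, 5.0, 5.1, 5.0, 5.4, 5.9, 6.5, 7.0]
first_alarm = detect_trend(unemployment, window=4, threshold=0.3)
```

The window length reflects the trade-off in response time mentioned above: a short window reacts quickly but is sensitive to noise, which may suit highly responsive applications, whereas a long window suits long-range detection.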

Much research has been done in the field of 
information retrieval, automatically grouping (clustering)
documents, performing automated text summarization,
and automatically labeling groups of
documents. 

Policymakers and investigators are, obviously, 
eager to know if there are ways that can reliably 
predict each turn in the economy. Economists have 
developed a wide variety of techniques to detect and 
monitor changes in economic activities. The concept 
of business cycles is defined as fluctuations in the 
aggregate economic activities of a nation. A busi- 
ness cycle includes a period of expansion, followed 
by recessions, contractions, and revivals. Three 
important characteristics are used when identifying 
a recession: duration, depth, and diffusion — the 
three Ds. A recession has to be long enough, from a 
year to 10 years; a recession has to be bad enough, 
involving a substantial decline in output; and a reces- 
sion has to be broad enough, affecting several 
sectors of the economy. 

Topic Detection and Tracking 

Topic Detection and Tracking (TDT) is a sub-field 
primarily rooted in information retrieval. TDT aims 
to develop and evaluate technologies required to 
segment, detect, and track topical information in a 
stream consisting of news stories. TDT has five 
major task groups: (1) story segmentation, (2) topic 
detection, (3) topic tracking, (4) first story detection, 
and (5) story link detection. Topic detection focuses 
on discovering previously unseen topics, whereas 
topic tracking focuses on monitoring stories known 
to a TDT system. First story detection (FSD) aims 
to detect the first appearance of a new story in a time 
series of news associated with an event. Roy, Gevry,
and Pottenger (2002) presented methodologies for
trend detection. Kontostathis, Galitsky, Roy,
Pottenger, and Phelps (2003) gave a comprehensive
survey of emerging trend detection in textual data
mining in terms of four distinct aspects: (1) input data
and attributes, (2) learning algorithms, (3) visualization,
and (4) evaluation.

TDT projects typically test their systems on TDT 
data sets, which contain news stories and event 
descriptors. The assessment of the performance of 
a TDT algorithm is based on Relevance Judgment, 
which indicates the relevancy between a story and 
an event. Take the event descriptor Oklahoma City 
Bombing as an example. If a matching story is about 
survivors’ reaction after the bombing, the relevance
judgment would be Yes. In contrast, the relevance
judgment of the same story and a different event 
descriptor U.S. Terrorism Response would be No. 
Swan and Allan (1999) reported their work on 
extracting significant time varying features from 
text based on this type of data. 

An interesting observation of news stories is that 
events are often reported in bursts. Yang, Pierce, and
Carbonell (1998) depicted a daily histogram of story 
counts over time. News stories about the same event 
tend to appear within a very narrow time frame. The 
gap between two bursts can be used to discriminate 
distinct events. 
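
The idea of using the gap between bursts to separate events can be sketched in a few lines of Python. This is only an illustration: the function name, the max_gap parameter, and the day numbers are hypothetical and are not drawn from Yang, Pierce, and Carbonell (1998).

```python
def group_into_bursts(story_days, max_gap):
    """Split story dates (as day numbers) into bursts.

    Two consecutive stories belong to the same burst when they are at most
    max_gap days apart; a longer gap starts a new burst.
    """
    bursts = []
    for day in sorted(story_days):
        if bursts and day - bursts[-1][-1] <= max_gap:
            bursts[-1].append(day)
        else:
            bursts.append([day])
    return bursts


# Hypothetical publication days of stories matching one query.
days = [1, 1, 2, 3, 3, 4, 20, 21, 21, 22]
print(group_into_bursts(days, max_gap=3))
```

With these invented values, the stories fall into two bursts, which would be treated as candidates for two distinct events.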

Kleinberg (2002) developed a burst detection 
algorithm and applied it to the arrivals of e-mail and
words used in titles of articles. Kleinberg was moti- 
vated by the need to filter his e-mail. He expected 
that whenever an important event occurs or is about 
to occur, there should be a sharp increase of certain 
words that characterize the event. He called such 
sharp increases bursts. Essentially, Kleinberg’s burst
detection algorithm analyzes the rate of increase of 
word frequencies and identifies the most rapidly 
growing words. He tested his algorithm on the full 
text of all the State of the Union addresses since 
1790. The burst detection algorithm identified impor- 
tant events occurring at the time of some of the 
speeches. For example, depression and recovery 
were bursty words in 1930-1937, fighting and 
Japanese were bursty in 1942-1945, and atomic 
was the buzz word in 1947 and 1959. 
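
The notion of ranking words by how rapidly their frequency grows can be illustrated with a much simpler heuristic than Kleinberg's actual algorithm. The Python sketch below merely compares smoothed relative frequencies between two periods; it is not Kleinberg's method, and the function name, parameters, and word counts are assumptions invented for this example.

```python
def bursty_words(earlier_counts, later_counts, min_ratio=2.0, smoothing=1.0):
    """Rank words by growth in relative frequency between two periods.

    A simplified stand-in for burst detection: a word is reported when its
    smoothed relative frequency grew by at least min_ratio.
    """
    vocabulary = set(earlier_counts) | set(later_counts)
    # Additive smoothing avoids division by zero for previously unseen words.
    denom_earlier = sum(earlier_counts.values()) + smoothing * len(vocabulary)
    denom_later = sum(later_counts.values()) + smoothing * len(vocabulary)
    ranked = []
    for word in vocabulary:
        p_earlier = (earlier_counts.get(word, 0) + smoothing) / denom_earlier
        p_later = (later_counts.get(word, 0) + smoothing) / denom_later
        ratio = p_later / p_earlier
        if ratio >= min_ratio:
            ranked.append((word, ratio))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical word counts for two consecutive periods.
earlier = {"economy": 10, "depression": 1, "nation": 20}
later = {"economy": 11, "depression": 12, "nation": 19}
for word, ratio in bursty_words(earlier, later):
    print(word, round(ratio, 2))
```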



EMERGING TREND DETECTION 
(ETD) 

ETD for Text Data 

Unlike financial and statistical data typically found in 
an economist’s trend detection portfolio, ETD in 
computer science often refers to the detection of 
trends in textual data, such as a collection of text 
documents and a stream of news feed. ETD takes a 
large collection of textual data as input and identifies 
topic areas that are previously unseen or are growing
in importance within the corpus (Kontostathis et
al., 2003). This type of data mining can be instrumental
in supporting the discovery of emerging trends
within an industry and improving the understanding
of large volumes of information maintained by organizations
(Aldana, 2000). In the past few years,
many companies have been storing their data electronically.
As the volume of data grows, it holds
information that, if analyzed in the form of trends
and patterns, can be valuable to the company, provided
it is appropriately and accurately extracted. By
using ETD, companies can extract the meaningful
data and use it to gain a competitive advantage
(Aldana, 2000). ETD provides a viable way to analyze
the evolution of a field. The problem switches
from analyzing huge amounts of data to determining
how to analyze them.

The Role of HCI in ETD 



Once the data corpus is scanned, the user should 
be provided with feedback about the corpus. The 
user should be provided with information like the 
number of documents found, number of topics (or 
trends) found, number of new trends found, and 
other related information. For example, the system 
studied by Masao and Koiti (2000) produces an
entity-relationship (ER) graph showing the relation 
of topics. This not only shows the user what new 
trends were found, but also shows how they are 
related. ETD systems should also support an adap- 
tive search mechanism. Users should have the 
option of providing keywords to search for emerging
trends in specific fields.

ETD systems are complicated to make and understand;
thus, there are many HCI issues that must be
considered. First of all, the system should let the user
define what an emerging trend is. In general, an 
emerging trend can be defined as a significant quan- 
titative growth over time. However, what counts as 
significant should not be entirely determined by com- 
puter algorithms. 

Many ETD algorithms have different threshold 
levels to define a topic as an emerging trend. Thus 
threshold levels should not be fixed and unchange- 
able for a system. Also, the user should be able to 
define what documents are in the data corpus. Addi- 
tionally, the algorithm should be hidden from the user. 
Ideally, the system would take its inputs and produce 
the outputs. When the user is given information, 
pertaining to inputs and outputs, sufficient amounts of 
user guidance should be provided. The design of an 
ideal user interface of a computer-based information 
system should be intuitive and self-explanatory. Us- 
ers should feel that they are in control and they can 
understand what is going on. Despite the technical 
complexity of an underlying algorithm, the user inter- 
face should clearly convey the functions to the user 
(Norman, 1998). 
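
One way to picture a threshold that is not fixed by the system is to make it an explicit argument supplied by the user. The following Python sketch is a minimal illustration under that assumption; the function name, the growth measure, and the topic counts are hypothetical and do not describe any particular ETD system.

```python
def emerging_topics(counts_by_topic, min_growth):
    """Return topics whose count grew by at least min_growth (last / first).

    min_growth is supplied by the user rather than hard-coded, so the same
    corpus can be examined under different definitions of "significant".
    """
    emerging = []
    for topic, counts in counts_by_topic.items():
        first, last = counts[0], counts[-1]
        if first == 0:
            grew_enough = last > 0  # previously unseen topic
        else:
            grew_enough = last / first >= min_growth
        if grew_enough:
            emerging.append(topic)
    return sorted(emerging)


# Hypothetical yearly document counts per topic.
counts = {"XML": [2, 10, 40], "HTML": [50, 55, 60], "SGML": [8, 7, 6]}
for threshold in (1.1, 5.0):
    print(threshold, emerging_topics(counts, threshold))
```

Running the same data under two thresholds shows why the user, not the algorithm alone, should decide what counts as significant: the looser setting reports two topics, the stricter one only one.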

Once a new trend is found, the system should 
include some mechanisms to define the essence of 
the new trend. A text summarization algorithm is a 
possible solution to this problem. Text summarization 
is capturing the essence of a data set (a single 
paragraph, document, or cluster) after reviewing the 
entire data set and producing output that describes 
the data set. 



APPLICATIONS 

Automatic trend detection has benefited a wide 
range of applications. An analyst will find emerging 
trend detection techniques useful in his area of 
work. The most generic application is to detect a 
new topic in a field and track its growth and use over 
time (Roy et al., 2002). Two examples are cited in 
the following sections. 

European Monitoring Center for Drugs 
and Drug Addiction (EMCDDA) 

The EMCDDA was disappointed when it realized 
that it failed to recognize the emerging trend in the 
use of the drug ecstasy. “...earlier identification of
new drug consumption patterns would allow more 
time to assess the likely impact of such changes 
and, therefore, facilitate the earlier development of 
appropriate responses” (EMCDDA, 1999). With 
an effective trend detection system, agencies like 
the EMCDDA can prepare for and prevent the 
associated problems with a drug epidemic. However,
with the number of documents in some databases
reaching over 100,000, a manual review of the
data is impossible.

XML 

The emergence of XML in the 1990s is shown in
Figure 1 in terms of the growing number of articles
published each year on the second-generation language
of the World Wide Web. Market and field
analysts will find such knowledge of an emerging
trend particularly useful. For instance, a market
analyst watching a biotech firm will want to know
about trends in the biotechnology field and how they
affect companies in the field (Kontostathis et al.,
2003).

Figure 1. The growth of the number of articles published on the topic of XML (Kontostathis et al., 2003)

Stock market analysts rely on patterns to observe 
market trends and make predictions. In general, the 
goal is to identify patterns from a corpus of data. In 
the past, analysts have relied on the human eye to 
discover these patterns. In the future, trend and 
topic tracking systems can take over this role, thus 
providing a more efficient and reliable method for 
stock market analysis. 



FUTURE TRENDS 

The future is promising for HCI concepts to be 
heavily embedded in the analysis and design of 
emerging trend detection (ETD) systems. Powerful 
data modeling techniques can make salient patterns 
clearer and in sharper contrast. Some of the major 
technical problems are how to make the changes 
over time easy to understand and how to preserve 
the overall context in which changes take place. 

ThemeRiver is a visualization system that uses 
the metaphor of a river to depict thematic flows over 
time in a collection of documents (Havre, Hetzler, 
Whitney, & Nowell, 2002). The thematic changes 



Table 1. Usability goals and how to apply them to ETD

Usability Goal | Definition | ETD Application
Learnability | “How easy the system is to learn” (Rozanski & Haake, 2003) | The system must be easy to learn for people from a wide variety of fields, including those with non-technical backgrounds.
Efficiency | “How quickly users can complete their tasks” (Rozanski & Haake, 2003) | The system should let the user focus on issues that are relevant to trend detection, without having to worry about issues with the system.
Memorability | “How easy the system is to remember” (Rozanski & Haake, 2003) | Users should not have to relearn the system each time they want to use it.
Control of Errors | Prevention and recovery from errors (Rozanski & Haake, 2003) | The system design should make errors less likely to happen, and when they do happen, the system should help the user out of the errors.
User Satisfaction | “How much users like the system” (Rozanski & Haake, 2003) | The users should be able to accomplish their goals without frustration.

Table 2. Nielson’s heuristics and application to ETD

Heuristic | Application
Visibility of system status | While algorithms may take a while to process, there should be feedback so the user knows the progress of the system. (Nielson)
Match between real world and system | The system directions should be presented in language the user can understand; avoid complicated jargon and technical terms. (Nielson)
User control and freedom | The user should be able to set the various thresholds that go into defining topics and emerging trends. (Nielson)
Consistency and standards | Uniform color schemes and presentation of data are necessary. (Nielson)
Error prevention | Steps should be taken to prevent users from entering thresholds that do not work and starting processes without sufficient input. (Nielson)
Recognition rather than recall | Users should not have to remember long, complicated processes. The directions should be presented to them on the screen, or the setup should give the users clues on what to do next. (Nielson)
Flexibility and efficiency of use | The user should be able to easily change thresholds to compare results. There should be shortcuts available for more experienced users as well. (Nielson)
Aesthetic and minimalist design | The interface should be kept simple. (Nielson)
Help users recognize, diagnose, and recover from errors | Errors should be presented in a manner so that it does not look like regular data. (Nielson)
Help and documentation | Ample user manuals should be provided and should be presented in simple language. (Nielson)



are shown along a time line of corresponding exter- 
nal events. A thematic river consists of frequency 
streams of terms; the changing width of a stream 
over time indicates the changes of term occur- 
rences. The occurrence of an external event may be 
followed by sudden changes of thematic strengths. 
Searching for an abruptly widened
thematic stream is a much more intuitive way to
detect a new story than relying on text-based TDT systems
that can only report changes in terms of statistics.

There are many things to keep in mind while 
developing an HCI-friendly ETD system. The basic 
usability goals can be used as a guideline for producing
a user-friendly ETD system. Striving to make
a system learnable, efficient, and memorable, to keep
errors under control, and to give the user satisfaction
from using the system lays the foundation for an
HCI-friendly system. Table 1 defines each of the
usability goals and how they can be applied in an 
ETD system. 

A set of usability heuristics, proposed by Jakob 
Nielson (n.d.), can also serve as a good rule of thumb
(see Table 2). 



Task analysis is a detailed description of an 
operator’s task, in terms of its components, to 
specify the detailed human activities involved, 
and their functional and temporal relationships 
(HCI Glossary, 2004). By having users describe 
their process step by step, designers can learn much
about the user’s behavior. When conducting task 
analysis, have the users describe “the steps they 
would follow, the databases and tools they would 
use, and the decision points in the process” (Bartlett 
& Toms, 2003). 

CONCLUSION 

Emerging trend detection is a promising field that 
holds many applications. However, for ETD sys- 
tems to reach their full potential, they must be 
effective, easy to learn, easy to understand, and easy 
to use. A poorly designed system will drive users
away from this technology. It is important to remember
that ETD systems are interactive systems. An
ETD system that just takes a data corpus and scans
it is not an effective one. Users must be able to
define and experiment with thresholds, view feedback
about the data corpus, and understand
new trends.

KEY TERMS

Burst Detection: The identification of sharp
changes in a time series of values. Examples of
bursts include the increasing use of certain words in
association with given events.

Information Visualization: A field of study
that aims to utilize humans’ perceptual and cognitive
abilities to enable and enhance our understanding of
patterns and trends in complex and abstract information.
Computer-generated 2- and 3-dimensional
interactive graphical representations are among the
most frequently used forms.

Intellectual Turning Points: Scientific work
that has fundamentally changed the subsequent
development in its field. Identifying intellectual turning
points is one of the potentially beneficial areas of
application of trend detection techniques.

Paradigm Shift: A widely known model in the
philosophy of science proposed by Thomas Kuhn.
Paradigm shift is regarded as the key mechanism
that drives science. The core of science is the
domination of a paradigm. Paradigm shift is necessary
for a scientific revolution, which is how science
advances.

Topic Detection and Tracking: A sub-field of
information retrieval. The goal is to detect the first
appearance of text that differs from a body of
previously processed text, or to monitor the behaviour
of some identified themes over time.

Trend: The continuous growth or decline of a
variable over a period of time.

Trend Detection: Using quantitative methods
to identify the presence of a trend. A number of
domain-specific criteria may apply to determine
what qualifies as a trend, for example, in terms of
duration, diversity, and intensity. Primary quality
measures of trend detection include sensitivity and
accuracy.

Turning Point: A turning point marks the beginning
or the end of a trend. For example, the point at
which the economy turns from recession to growth.

INTRODUCTION

Conceptual modeling appears to be the heart of good 
software development (Jackson, 2000). The cre- 
ation of a conceptual model helps to understand the 
problem raised and represents the human-centered/ 
problem-oriented moment in the software process, 
as opposed to the computer-centered/software-ori- 
ented moment of the computational models (Blum, 
1996). The main objective of human-computer interaction
(HCI) is also precisely to make human beings
the focal point that technology should serve rather 
than the other way round. 

The conceptual models are built with conceptual 
modeling languages (CMLs), whose specification 
involves constructors and rules on how to combine 
these constructors into meaningful statements about 
the problem. 

Considering the criterion of the representation 
capability of the CMLs in software engineering, 
their main drawback is that they remain too close to 
the development aspects (Jackson, 1995). The con- 
structors are too much oriented toward the compu- 
tational solution of the problem, and therefore, the 
problem is modeled with implementation concepts 
(computer/software solution sensitivity) rather than 
concepts that are proper to human beings (human/ 
problem sensitivity) (Andrade, Ares, Garcia &
Rodriguez, 2004). This stands in open opposition to
what we have said about the moments in the soft- 
ware process and HCI. Moreover, this situation 
seriously complicates the essential validation of the 
achieved conceptual model, because it is drawn up 
in technical terms that are very difficult to under- 
stand by the person who faces the problem (Andrade 
et al., 2004). 

The semantics of the constructors determines 
the representation capability (Wand, Monarchi, Par- 
sons & Woo, 1995). Since the constructors are too 
close to implementation paradigms, the CMLs that 
currently are being used in software engineering are 
incapable of describing the problem accurately. 

Suitable human/problem-related theoretical guide- 
lines should determine which constructors must be 
included in a genuine CML. This article, subject to 
certain software-independent theoretical guidelines, 
proposes the conceptual elements that should be 
considered in the design of a real CML and, conse- 
quently, what constructors should be provided. 

The Background section presents the software- 
independent guidelines that were taken into account 
to identify the above-mentioned conceptual ele- 
ments. The Main Focus of the Article section dis- 
cusses the study that identified those elements. 
Finally, the Future Trends section presents the most 
interesting future trends, and the final section con- 
cludes. 



Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited. 



Human-Centered Conceptualization and Natural Language 



BACKGROUND 

In generic conceptualization, concepts are logically 
the primary elements. Despite their importance, the 
nature of concepts remains one of the toughest 
philosophical questions. However, this does not stop 
us from establishing some hypotheses about concepts
(Díez & Moulines, 1997):

• HC1. Abstract Entities: Concepts are iden- 
tifiable abstract entities to which human beings 
have access, providing knowledge and guid- 
ance about the real world. 

• HC2. Contraposition of a System of Con- 
cepts with the Real World: Real objects can 
be identified and recognized thanks to the avail- 
able concepts. Several (real) objects are sub- 
sumed within one and the same (abstract) 
concept. 

• HC3. Connection Between a System of
Concepts and a System of Language: The
relationship of expression establishes a connection
between concepts and expressions,
and these (physical entities) can be used to
identify concepts (abstract entities).

• HC4. Expression of Concepts by Non-Syncategorematic
Terms: Practically all non-syncategorematic
terms introduced by an expert
in a field express a concept.

• HC5. Need for Set Theory: For many pur- 
poses, the actual concepts should be substi- 
tuted by the sets of subsumed objects to which 
set theory principles can be applied. 

Likewise, from a general viewpoint, any 
conceptualization can be defined formally as a triplet 
of the form (Concepts, Relationships, Functions) 
(Genesereth & Nilsson, 1986), which includes, re- 
spectively, the concepts that are presumed or hy- 
pothesized to exist in the world, the relationships (in 
the formal sense) among concepts, and the functions 
(also in the formal sense) defined on the concepts. 
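
To make the triplet more concrete, the following Python sketch represents a conceptualization as three collections and checks that relationships and functions refer only to known concepts. The class, the field layout, and the sample entries are assumptions chosen for illustration; Genesereth and Nilsson (1986) define the triplet formally, not in code.

```python
from dataclasses import dataclass, field


@dataclass
class Conceptualization:
    """A (Concepts, Relationships, Functions) triplet, in a simplified form."""
    concepts: set = field(default_factory=set)
    # Relationship name -> set of tuples of related concepts.
    relationships: dict = field(default_factory=dict)
    # Function name -> mapping from a concept to a concept.
    functions: dict = field(default_factory=dict)


def is_well_formed(c):
    """Check that relationships and functions only mention known concepts."""
    for tuples in c.relationships.values():
        for related in tuples:
            if not set(related) <= c.concepts:
                return False
    for mapping in c.functions.values():
        for argument, value in mapping.items():
            if argument not in c.concepts or value not in c.concepts:
                return False
    return True


# Hypothetical entries, invented for this example.
model = Conceptualization()
model.concepts |= {"mechanic", "company", "red"}
model.relationships["works_for"] = {("mechanic", "company")}
model.functions["uniform_color"] = {"mechanic": "red"}
print(is_well_formed(model))
```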

This and the fact that natural language is the 
language par excellence for describing a problem 
(Chen, Thalheim & Wong, 1999) constitute the basis 
of our study. 



MAIN FOCUS OF THE ARTICLE 

It would certainly not be practical to structure a 
CML on the basis of the previous three formal 
elements, because (i) concepts are abstract entities 
(HC1); (ii) relationships and functions are defined on
the concepts, which increases the complexity; and 
(iii) people naturally express themselves in natural 
language (HC3: connection between a system of 
concepts and a system of language). 

Taking this and HC4 (expression of concepts by 
non-syncategorematic terms) into account, we pro- 
pose defining the CMLs on the basis of the concep- 
tual elements that result from the analysis of natural 
language. This procedure stems from the fact that 
there is a parallelism between natural language and 
the CML (Hoppenbrouwers, van der Vos & 
Hoppenbrouwers, 1997). 

From the analysis detailed in this section, we find 
that the identified conceptual elements actually can 
be matched to some of the three elements of the 
previous formal triplet; that is, the generic and 
formal definition is not overlooked. However, ulti- 
mately, a functional information taxonomy can be 
established, which is much more natural and practi- 
cal. 

Analyzing Natural Language 

Based on HC4, the conceptual elements were iden- 
tified by analyzing the non-syncategorematic cat- 
egories of nouns, adjectives, and verbs. Moreover, 
importance was also attached to adverbs, locutions, 
and other linguistic expressions, which, although 
many are syncategorematic terms, were considered 
relevant because of their conceptual load. 

Nouns 

Nouns can be divided into different groups accord- 
ing to different semantic traits. The most commonly 
used trait is the classification that determines whether 
the noun is common or proper. 

Considering this latter trait, we notice a parallel- 
ism between nouns and elements that are handled in 
any conceptualization: common nouns can lead to 
concepts or properties, and proper nouns can lead to 
property values. The following subsections consider 
these elements. 

Concepts

A concept can be defined as a mental structure, 
which, when applied to a problem, clarifies it to the
point of solving it. Here, this term is used
in the sense of anything that is relevant in the problem 
domain and about which something is to be expressed 
by the involved individuals. 

Interpreted in this manner, relationships, actions, 
and many other elements actually are concepts. 
However, here, we only consider the concepts that 
are proper to the problem domain; that is, concept 
means anything that is, strictly speaking, proper to the 
problem, which may refer to concrete or abstract, or 
elementary or compound things. 

The concepts thus considered are included within 
C in the triplet (C, R, F), and bearing in mind HC2 
(contraposition of a system of concepts with the real 
world), a concept subsumes a set of objects that are 
specific occurrences of it. 

Properties 

A property is a characteristic of a concept or a 
relationship, as a relationship could be considered as 
a compound concept (we will address this conceptual 
element later). 

The set of all the properties of a concept/relation- 
ship describes the characteristics of the occurrences 
subsumed by this concept/relationship, each of which 
can be considered as functions or relationships of the 
triplet (C, R, F), depending on how the problem is to be
conceptualized (Genesereth & Nilsson, 1986). 

Property Values 

The value(s) of a property is (are) selected from a 
range of values, which is the set of all the possible 
values of the property. 

Considering the triplet (C, R, F), if a property is
conceptualized as a function, the possible values of 
the property are within C. If it is conceptualized as a 
relationship, the actual property really disappears 
and unary relationships are considered for each 
possible property value instead (Genesereth & Nilsson, 
1986). 



Adjectives 

The adjectives are used to determine the extent of 
the meaning of the noun (adjectival determiners) or 
to qualify the noun (adjectival modifiers). 

The adjectival determiners always accompany a 
noun and do not have semantic traits that alter the 
semantics of the noun phrase. Consequently, these 
elements do not have to be considered in a 
conceptualization. 

The adjectival modifiers can be descriptive or 
relational. The former refer to a property of the 
noun that they qualify. These adjectives are classed 
into different types according to their semantic trait 
(Miller, 1995): quality, size, type, and so forth. Only 
the type-related classification can lead to a new 
interpretation to be conceptualized — the relation- 
ship of generalization/specialization, which will be 
described next. 

Finally, the relational adjectival modifiers allude
to the scope of the noun, and therefore, the above 
interpretation is also possible. 

Verbs 

The linguistic theory that we have used to analyze 
verbs is the Case Grammar Theory (Cook, 1989), 
which describes a natural language sentence in 
terms of a predicate and a series of arguments 
called cases (agent, object, etc.). 

This theory provides a semantic description of 
the verbs, which is the fundamental aspect to be 
considered here. It is precisely the semantic rela- 
tionship between the verb and its cases that is 
interpreted and modeled in conceptual modeling. 
This theory interprets the relationship between con- 
cepts; the verb alludes to the relationship and the 
cases to the related concepts. This obviously equates 
with R in the (C, R, F) triplet. 

Case Grammar theories establish a verb classi- 
fication, depending on the cases that accompany 
the verb. The semantics and, therefore, the 
conceptualization of the relationship differ, depend- 
ing on the type of verb that expresses the informa- 
tion. Nevertheless, these differences are always 
conceptual nuances of relationships. 

Case Grammar theories do not establish just one
verb classification that depends on the semantic 
nuances and the cases considered. We have used 
the classification established by Martinez (1998), 
because it places special emphasis on justifications 
based on natural language. Looking at this classifi- 
cation, we find that there are the following different 
types of relationships: 

1. Generalization/Specialization: This repre- 
sents the semantics “is a” in natural language, 
indicating the inclusion of a given concept in 
another more general one. This relationship 
takes the form of a hierarchy of concepts in 
which what is true for a set is also true for its 
subsets. In this respect, remember HC5
(need for set theory) and the importance of
determining whether the sets (concepts) are
disjoint or overlapping and whether the
relationship is total or partial.

2. Aggregation: This represents the natural lan- 
guage semantics “part of,” making it possible to 
form a concept from other concepts of which 
the former is composed. Aggregation is a relationship
that has the property of transitivity.
However, it does not always appear to be 
transitive (e.g., the hands are part of a me- 
chanic, and the mechanic is part of a company, 
but the hands are not part of the company). 
With the aim of resolving this paradox, several 
types of aggregations have been identified. 
There are two main types of aggregation, and 
the property of transitivity holds, provided that 
the aggregations are of the same type, although 
they do not always lead to intransitivity, even if 
they are different: 

a. Member/Collection: The parts or mem- 
bers are included in the collection because 
of their spatial proximity or social connec- 
tion. The parts of which the whole is 
composed are of the same type, and the 
whole does not change if one is removed. 

b. Compound/Component: The compo- 
nents perform a function or have a rela- 
tionship with respect to other components 
or the whole that they form. The parts are 
of different types, and the whole changes 
if a part is removed. 



3. Defined by the Meaning of the Verb:
They are particular to each domain and
generally differ from one domain to another.
Therefore, a classification cannot
be established as for the previous relationships,
where the meaning remains unchanged
in any domain.
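
The point about transitivity and aggregation types can be sketched as follows. This Python example follows "part of" links only through aggregations of the same type, which is a deliberately conservative assumption: as noted above, mixing types does not always break transitivity. The constant names, the fact list, and the function are hypothetical.

```python
MEMBER = "member/collection"
COMPONENT = "compound/component"

# Hypothetical (part, whole, aggregation type) facts.
part_of = [
    ("finger", "hand", COMPONENT),
    ("hand", "mechanic", COMPONENT),
    ("mechanic", "company", MEMBER),
]


def inferred_parts(whole, facts):
    """Return the parts of `whole` reachable through a single aggregation type."""
    found = set()
    for kind in {k for _, _, k in facts}:
        seen = set()
        frontier = [whole]
        while frontier:
            current = frontier.pop()
            for part, container, k in facts:
                if container == current and k == kind and part not in seen:
                    seen.add(part)
                    frontier.append(part)
        found |= seen
    return found


print(sorted(inferred_parts("mechanic", part_of)))
print(sorted(inferred_parts("company", part_of)))
```

Under this assumption, the finger is inferred to be part of the mechanic, but the hand is not inferred to be part of the company, which matches the example in the text.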

Since a CML should be able to be used to
represent reality as closely as possible and gather 
most of the semantics, it should provide different 
constructors for the different types of relationships. 
This distinction will make it possible to immediately 
assimilate all the underlying semantics for each type. 

Other Grammatical Categories 



There are linguistic structures that have not yet been 
analyzed and which actually introduce aspects to be 
conceptualized: adverbs, locutions, and other linguistic
expressions that are very frequent in problem
modeling. The conceptual elements that these struc- 
tures introduce are as follows: 

1. Constraints: No more than, like minimum, 
and so forth. 

2. Inferences and Calculations: Calculate, 
if... then..., deduce, and so forth. 

3. Sequence of Actions: First, second, after, 
and so forth. 

Constraints 



A constraint can be defined as a predicate whose 
values are true or false for a set of elements. 
Therefore, it can be viewed in the triplet (C, R, F) as
a function that constrains the possible values of 
these elements. 

Constraints are new elements to be conceptual- 
ized but which affect the elements already identi- 
fied. Indeed, constraints can be classified according 
to the element they affect — constraints on proper- 
ties, property values, or relationships. The first af- 
fect the actual properties, demanding compulsoriness 
or unicity. The second affect property values, re- 
stricting their possible values in the occurrences. 
The third restrict the occurrences that can partici- 
pate in a relationship. 
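
As an illustration only, the three kinds of constraints described above can be sketched as predicates that return true or false for a set of occurrences. The following Python sketch is not part of the CML discussed in this article; the names Occurrence, compulsory, unique, value_in_range, and max_participants are hypothetical and were chosen for this example.

```python
from typing import Callable, Dict, List

Occurrence = Dict[str, object]                 # one instance: property name -> value
Constraint = Callable[[List[Occurrence]], bool]


def compulsory(prop: str) -> Constraint:
    """Constraint on a property: every occurrence must give it a value."""
    return lambda occs: all(o.get(prop) is not None for o in occs)


def unique(prop: str) -> Constraint:
    """Constraint on a property: no two occurrences share the same value."""
    def check(occs: List[Occurrence]) -> bool:
        values = [o[prop] for o in occs if o.get(prop) is not None]
        return len(values) == len(set(values))
    return check


def value_in_range(prop: str, low: float, high: float) -> Constraint:
    """Constraint on property values: restrict the admissible values."""
    return lambda occs: all(
        low <= o[prop] <= high for o in occs if o.get(prop) is not None
    )


def max_participants(limit: int) -> Constraint:
    """Constraint on a relationship: "no more than" `limit` occurrences take part."""
    return lambda occs: len(occs) <= limit


people = [{"id": 1, "age": 30}, {"id": 2, "age": 41}]
checks = [compulsory("id"), unique("id"), value_in_range("age", 0, 120), max_participants(2)]
all_hold = all(check(people) for check in checks)
```

Each function returns a predicate, which mirrors the view of a constraint as a function that restricts the possible values of already identified elements.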

Inferences and Calculations

Considering the triplet (C, R, F), these elements can
be placed within F. This is because they allude to the 
manipulation of known facts to output new facts. 

Considering these elements involves providing the
language with constructors to conceptualize infer- 
ences, which indicate what to infer and what to use 
for this purpose, and calculations, which indicate 
how to calculate something using a mathematical or 
algorithmic expression. 
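
A minimal sketch of how such constructors might look is given below, assuming a simple dictionary of known facts. The classes Inference and Calculation and the function derive are hypothetical names used for illustration; they are not the constructors of the CML referred to in this article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Facts = Dict[str, object]


@dataclass(frozen=True)
class Inference:
    """States what to infer (`target`) and what to use for it (`uses`)."""
    target: str
    uses: Tuple[str, ...]
    rule: Callable[..., object]


@dataclass(frozen=True)
class Calculation:
    """States how to compute `target` from `uses` with an explicit formula."""
    target: str
    uses: Tuple[str, ...]
    formula: Callable[..., float]


def derive(element, facts: Facts) -> Facts:
    """Manipulate known facts to output a new fact (the role of F in the triplet)."""
    args = [facts[name] for name in element.uses]
    if isinstance(element, Inference):
        value = element.rule(*args)
    else:
        value = element.formula(*args)
    return {**facts, element.target: value}


# "Calculate the total" and "if the total is at least 100, then a discount applies".
total = Calculation("total", ("unit_price", "quantity"), lambda p, q: p * q)
discount = Inference("discount_applies", ("total",), lambda t: t >= 100)

facts = derive(total, {"unit_price": 12.5, "quantity": 10})
facts = derive(discount, facts)
```

The calculation carries a mathematical expression, whereas the inference only names its premises and conclusion, which is the distinction drawn in the text.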

Sequence of Actions 

This element indicates what steps the human beings 
take and when to solve the problem. This means that 
the modeling language should include the construc- 
tors necessary to represent the following: 

• Decomposition into Steps: Human beings 
typically solve problems by breaking them down 
into steps. Logically, the non-decomposed steps 
should indicate exactly what function they have 
and how they are carried out (inferences and 
calculations). 

• Step Execution Order: Clearly, the order in
which the identified steps are taken is just as
essential as their decomposition.

In the triplet (C, R, F), the actions or steps can be
considered within C and their decomposition and
order as relationships within R. For the latter type of
relationships, constructors are needed to enable the
bifurcations governed by the information known or
derived from the domain.

Figure 1. Functional levels of information and
their interrelationships

Establishing the Functional Information 
Taxonomy 

Depending on the function that they fulfill in the 
problem, all of the previously identified conceptual 
elements can be grouped into the following informa- 
tion levels (see Figure 1): 

• Strategic : It specifies what to do, when, and in 
what order. 

• Tactical: It specifies how to obtain new de- 
clarative information. 

• Declarative: It specifies the facts known about 
the problem. 

Figure 1 also shows the previously mentioned 
interrelationships between the different levels. The 
declarative level is managed by the other two levels, 
as it specifies what is used to decide on the alterna- 
tives of execution in the step sequence and on what 
basis the inferences and calculations are made. 
Moreover, the strategic level manages the tactical 
level, as it has to specify what inferences and 
calculations have to be made for each non-decom- 
posed step. 

In a CML that follows the presented approach,
constructors should be provided for the three
submodels that jointly make up a complete
conceptual model: declarative, tactical, and
strategic.
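
The interplay of the three levels can be sketched in a few lines of Python. This is an illustrative assumption about how the levels might be represented, not the notation of any existing CML: Facts stands for the declarative level, Tactic for the tactical level, and the hypothetical classes Step and Branch together with the function run stand for the strategic level.

```python
from typing import Callable, Dict, List

Facts = Dict[str, float]                 # declarative level: known facts
Tactic = Callable[[Facts], Facts]        # tactical level: derives new facts


class Step:
    """A non-decomposed step; it names the tactics that carry it out."""
    def __init__(self, name: str, tactics: List[Tactic]):
        self.name = name
        self.tactics = tactics


class Branch:
    """A bifurcation: the facts decide which sub-plan is executed next."""
    def __init__(self, condition: Callable[[Facts], bool], if_true: list, if_false: list):
        self.condition = condition
        self.if_true = if_true
        self.if_false = if_false


def run(plan: list, facts: Facts, trace: List[str]) -> Facts:
    """Strategic level: execute steps in order; a nested list is a decomposed step."""
    for node in plan:
        if isinstance(node, Step):
            trace.append(node.name)
            for tactic in node.tactics:
                facts = tactic(facts)
        elif isinstance(node, Branch):
            chosen = node.if_true if node.condition(facts) else node.if_false
            facts = run(chosen, facts, trace)
        else:
            facts = run(node, facts, trace)
    return facts


compute_total = lambda f: {**f, "total": f["unit_price"] * f["quantity"]}
apply_discount = lambda f: {**f, "payable": f["total"] - 10.0}
keep_price = lambda f: {**f, "payable": f["total"]}

plan = [
    Step("compute total", [compute_total]),
    Branch(lambda f: f["total"] >= 100,
           [Step("apply discount", [apply_discount])],
           [Step("keep price", [keep_price])]),
]
```

In this sketch the strategic level decides what is done and in what order, each non-decomposed step delegates to tactics, and both levels read from and write to the declarative facts.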

FUTURE TRENDS 

Based on the previous study, we have defined a 
graphic CML, detailed in Andrade (2002), with a 
view to getting optimum expressiveness and man- 
ageability. However, not all the information involved 
in a conceptualization can be detailed using a graphic 
notation. Any attempt to do so would complicate
the notation so much that the benefits of a graphic
representation would be lost. For this reason,
not all the model aspects are set out graphically. 
Thus, the previously mentioned CML was defined in 
the following way: 

• Declarative Submodel: Constructs for concepts,
relationships, and constraints on properties
are graphic. Constructs for properties and
constraints on relationships are half-graphic. 
Property values and constraints on property 
values are expressed through textual constructs. 

• Tactical Submodel: Inferences are expressed 
through constructs that are half-graphic, 
whereas calculations are expressed through 
textual constructs. 

• Strategic Submodel: Constructs for step de- 
composition are graphic, whereas constructs 
for step execution order are half-graphic. 

To continue with the software process, research 
should focus now on identifying the criteria to (i) 
select the most suitable development paradigm(s) 
and (ii) map the conceptual constructions into the 
corresponding computational ones in that (those) 
selected paradigm(s). Moreover, a CASE tool that
supports the defined constructors and facilitates the
application of these criteria would be valuable.
We are now working on both aspects.

CONCLUSION 

The previously mentioned conceptual elements are 
considered in the conceptual modeling languages 
within software development today. However, (i) 
none of the languages considers them all; (ii) they 
are considered to represent certain technical devel- 
opment concepts; and (iii) as a result, the expres- 
siveness — semantics — of their constructors is lo- 
cated in the software solution (computer) domain 
and not in the problem (human) domain, as should be 
the case. 

The defined language has been applied to build 
the conceptual models in both software engineering 
(Andrade et al., 2004) and knowledge engineering 
(Ares & Pazos, 1998) conceptualization approaches, 
which demonstrates its generality and closeness to 
human beings and to the problem rather than to the 
software solution. 

KEY TERMS

Concept: A mental structure derived from acquired
information, which, when applied to a problem,
clarifies it to the point of solving it.



Conceptual Model: An abstraction of the prob- 
lem as well as a possible model of a possible concep- 
tual solution to the problem. 

Conceptual Modeling: The use of concepts 
and their relationships to deal with and solve a 
problem. 

Conceptual Modeling Language: A language 
used to represent conceptual models. 

Human/Problem-Sensitivity: The proximity to 
the human-centered/problem-oriented concepts, as 
opposed to the computer-centered/software-oriented 
concepts (i.e., computer/software solution-sensitiv- 
ity). 

Natural Language: A language naturally spo- 
ken or written by human beings. 

Non-Syncategorematic Terms: These linguis- 
tic terms (also known as categorematic terms) are 
capable of being employed by themselves as terms 
as opposed to syncategorematic terms. 

Problem: A situation in which someone wants 
something and does not immediately know what 
actions to take to get it. 

Syncategorematic Terms: These linguistic 
terms cannot stand as the subject or the predicate of 
a proposition. They must be used in conjunction with
other terms, as they have meaning only in such 
combination. 

INTRODUCTION

Historically, computer security has its roots in the 
military domain with its hierarchical structures and 
clear and normative rules that are expected to be 
obeyed (Adams & Sasse, 1999). The technical 
expertise necessary to administer most security 
tools dates back to a time when security was a
matter for trained system administrators and expert
users. A considerable amount of money and exper- 
tise is invested by companies and institutions to set 
up and maintain powerful security infrastructures. 
However, in many cases, it is the user’s behavior 
that enables security breaches rather than short- 
comings of the technology. This has led to the notion 
of the user as the weakest link in the chain (Schneier, 
2000), implying that the user was to blame instead of 
technology. The engineer’s attitude toward the fallible
human and the disregard of the fact that
technology’s primary goal is to serve humans
turned out to be hard to overcome (Sasse, Brostoff, 
& Weirich, 2001). 

BACKGROUND 

With the spreading of online work and networked 
collaboration, the economic damage caused by se- 
curity-related problems has increased considerably 
(Sacha, Brostoff, & Sasse, 2000). Also, the increas- 
ing application of personal computers, personal net- 
works, and mobile devices with their support of 
individual security configuration can be seen as one 
reason for the increasing problems with security 
(e.g., virus attacks from personal notebooks, leaks in 
the network due to personal wireless LANs, etc.) 
(Kent, 1997). During the past decade, the security 
research community has begun to acknowledge the 
importance of the human factor and has started to
take research on human-computer interaction into
consideration. The attitude has changed from blam- 
ing the user as a source of error toward a more user- 
centered approach trying to persuade and convince 
the user that security is worth the effort (Ackerman, 
Cranor, & Reagle, 1999; Adams & Sasse, 1999; 
Markotten, 2002; Smetters & Grinter, 2002; Whitten 
& Tygar, 1999; Yee, 2002). 

In the following section, current research results 
concerning the implications of user attitude and 
compliance toward security systems are introduced 
and discussed. In the subsequent three sections, 
security-related issues from the main application 
areas, such as authentication, email security, and 
system security, are discussed. Before the conclud- 
ing remarks, an outlook on future challenges in the 
security of distributed context-aware computing 
environments is given. 

USER ATTITUDE 

The security of a system is determined not only
by its technical aspects but also by the attitude of the
users of such a system. Dourish et al. (2003) distinguish
between theoretical security (i.e., what is
technologically possible) and effective security (i.e.,
what is practically achievable). Theoretical security,
in their terms, can be considered as the upper bound
of effective security. In order to improve effective 
security, the everyday usage of security has to be 
improved. In two field studies, Weirich and Sasse 
(2001) and Dourish et al. (2003) explored users’ 
attitudes to security in working practice. The find- 
ings of both studies can be summarized under the 
following categories: perception of security, percep- 
tion of threat, attitude toward security-related is- 
sues, and the social context of security. 

Perception of security frequently is very inaccurate.
Security mechanisms often are perceived as
holistic tools that provide protection against threats, 
without any detailed knowledge about the actual 
scope. Therefore, specialized tools often are consid- 
ered as insufficient, as they do not offer general 
protection. On the other hand, people might feel 
protected by a tool that does not address the relevant 
issue and thus remain unprotected (e.g., believing that a
firewall protects against e-mail viruses).

Perception of threats also reveals clear miscon- 
ceptions. None of the users asked considered them- 
selves as really endangered by attacks. As potential 
victims, other persons in their organization or other 
organizations were identified, such as leading per- 
sonnel, people with important information, or high- 
profile institutions. Only a few of them realized the 
fact that they, even though not being the target, could 
be used as a stepping stone for an attack. The 
general attitude was that “no one could do anything
with the information on my computer or with my
e-mails.”

Potential attackers mainly were expected to be 
hackers or computer kids, with no explicit malevo- 
lent intentions but rather seeking fun. Notorious and 
disturbing but not really dangerous offenders, such 
as vandals, spammers, and marketers, were per- 
ceived as a frequent threat, while on the other hand, 
substantially dangerous attackers such as criminals 
were expected mainly in the context of online bank- 
ing. 

The attitude toward security technology was 
rather reserved. Generally, several studies reported 
three major types of attitudes toward security: pri- 
vacy fundamentalists, privacy pragmatists, and pri- 
vacy unconcerned (Ackerman et al., 1999). Users’ 
experiences played a considerable role in their atti- 
tude, as experienced users more often considered 
security as a hindrance and tried to circumvent it in 
a pragmatic fashion in order to reach their work 
objectives. Weirich and Sasse (2001) report that 
none of the users absolutely obeyed the prescribed 
rules, but all were convinced that they would do the 
best they could for security. 

Additionally, users’ individual practices are often 
in disagreement with security technology. People 
use legal statements in e-mail footers or cryptic
e-mails, not giving explicit information but using
contextual cues instead. Such subsidiary methods,
together with the fact that people often seem
to switch to the telephone when talking about important
things (Grinter & Palen, 2002), indicate the poor
perception users have of security technology.

The feeling of futility was reported with respect 
to the need for constantly upgrading security mecha- 
nisms in a rather evolutionary struggle (i.e., if some- 
body really wants to break in, he or she will). As a 
result, personal accountability was not too high, as 
users believed that in a situation where someone 
misused his or her account, personal credibility 
would weigh more than computer-generated evi- 
dence, in spite of the fact that the fallibility of 
passwords is generally acknowledged.

The social context has been reported to play an 
important role in day-by-day security, as users are 
not permanently vigilant and aware of possible threats 
but rather concerned with getting their work done.
Therefore, it is no wonder that users try to delegate 
responsibility to technical systems (encryption, 
firewalls, etc.), colleagues and friends (the friend as 
expert), an organization (they know what they do), 
or institutions (the bank cares for secure transfers). 
Most people have a strong belief in the security of 
their company’s infrastructure. Delegation brings 
security out of the focus of the user and results in 
security unawareness, as security is not a part of the 
working procedure anymore. 

Whenever no clear guidelines are available, people 
often base their practice on the judgments of others, 
making the system vulnerable to social engineering 
methods (Mitnick, Simon, & Wozniak, 2002). In 
some cases, collaboration appears to make it neces- 
sary or socially opportune to disclose one’s pass- 
word to others for practical reasons, technical rea- 
sons, or as a consequence of social behavior, since 
sharing a secret can be interpreted as a sign of trust. 
Such sharing is a significant problem, as it is used in 
social engineering in order to obtain passwords and 
to gain access to systems. 

Dourish et al. (2003) came to the conclusion that 
“where security research has typically focused on 
theoretical and technical capabilities and opportuni- 
ties, for end users carrying out their work on com- 
puter systems, the problems are more prosaic” (p. 
12). The authors make the following recommenda- 
tions for the improvement of security mechanisms in 
the system and in the organizational context: 

• Users should be able to access security settings
easily and as an integral part of their actions, not
in a separate fashion as is the case today; therefore,
security issues should be integrated in the de- 
velopment of applications (Brostoff & Sasse, 
2001; Gerd tom Markotten, 2002). 

• It is necessary that people can monitor and 
understand the potential consequences of their 
actions (Irvine & Levin, 2000) and that they 
understand the security mechanisms employed 
by the organization. 

• Security should be embedded into working prac- 
tice and organizational arrangement, and visible 
and accessible in everyday physical and social 
environment (Ackerman & Cranor, 1999). 

• Security should be part of the positive values in 
an organization. So-called social marketing could 
be used to establish a security culture in a 
company. 

• The personal responsibility and the danger of 
personal embarrassment could increase the feel- 
ing of personal liability. 

• The importance of security-aware acting should 
be made clear by emphasizing the relevance to 
the organization’s reputation and financial
dangers.

As has been shown, the design and implementa- 
tion of security mechanisms are closely interlinked to 
the psychological and sociological aspects of the 
user’s attitude and compliance toward the system. 
Any security system is in danger of becoming ineffective
or even obsolete if it fails to provide adequate
support and to motivate users to use it properly. The
following sections discuss these findings in the con- 
text of the main application domains of computer 
security. 



AUTHENTICATION 

Information technology extends our ability to com- 
municate, to store and retrieve information, and to 
process information. With this technology comes the 
need to control access to its applications for reasons 
of privacy and confidentiality, national security, or 
auditing and billing, to name a few. Access control in 
an IT system typically involves the identification of a 
subject, his or her subsequent authentication, and,
upon success, his or her authorization to the IT
system.
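
The three stages named above can be illustrated with a short Python sketch based on the standard library. It is a simplified example under stated assumptions, not a description of any particular product: the class AccessControl, its methods, and the returned status strings are hypothetical, and the iteration count of the key derivation function is chosen arbitrarily.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so that the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class AccessControl:
    def __init__(self) -> None:
        self._users = {}     # user id -> (salt, password hash)
        self._rights = {}    # user id -> set of permitted actions

    def enroll(self, user: str, password: str, rights) -> None:
        salt = secrets.token_bytes(16)
        self._users[user] = (salt, hash_password(password, salt))
        self._rights[user] = set(rights)

    def request(self, user: str, password: str, action: str) -> str:
        record = self._users.get(user)                 # 1. identification
        if record is None:
            return "unknown subject"
        salt, stored = record
        candidate = hash_password(password, salt)      # 2. authentication
        if not hmac.compare_digest(stored, candidate):
            return "authentication failed"
        if action not in self._rights[user]:           # 3. authorization
            return "not authorized"
        return "granted"
```

A request is granted only if all three checks succeed in this order, which is why a weakness in the authentication step undermines the authorization that follows it.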

The crucial authentication step generally is car- 
ried out based on something the subject knows, has, 
or is. By far the most widespread means of authen- 
tication is based on what a subject has (e.g., a key). 
Keys unlock doors and provide access to cars, 
apartments, and contents of a chest in the attic. 
Keys are genuinely usable — four-year-olds can 
handle them. In the world of IT, something the 
subject knows (e.g., a password or a secret personal
identification number [PIN]) is the predominant
mechanism. 

The exclusiveness of access to an IT system 
protected by a password rests on the security of the 
password against guessing, leaving aside other tech- 
nical means by which it may or may not be broken. 
From an information theoretic standpoint, a uni- 
formly and randomly chosen sequence of letters 
and other symbols in principle provides the greatest
security. However, such a random sequence of 
unrelated symbols also is hard to remember, a 
relation that is rooted in the limitation of humans’ 
cognitive capabilities. 
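
The information theoretic argument can be made concrete: a sequence of L symbols drawn uniformly and independently from an alphabet of N symbols carries L · log2(N) bits of entropy. The short Python sketch below computes this value; the function name entropy_bits and the chosen example alphabets are illustrative assumptions.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random sequence of `length` symbols."""
    return length * math.log2(alphabet_size)


pin_bits = entropy_bits(10, 4)        # four-digit PIN, about 13.3 bits
password_bits = entropy_bits(94, 8)   # eight printable ASCII characters, about 52.4 bits
```

The figures hold only for truly random choices; passwords chosen by people are typically far more predictable, which is the tension between security and memorability described above.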

As a remedy, a variety of strategies were in- 
vented to construct passwords that humans can 
memorize more easily without substantially sacri- 
ficing the security of a password (e.g., passwords 
based on mnemonic phrases). For instance, Yan, et 
al. (2000) conducted a study with 400 students on 
the effect of three forms of advice on choosing 
passwords. They found, for example, that pass- 
words based on pass phrases were remembered as 
easily as naively chosen passwords, while being as 
hard to crack as randomly chosen passwords. In- 
sight into human cognition also has led to the 
investigation of alternatives such as cognitive pass- 
words (Zviran & Haga, 1990), word associations 
(Smith, 1987), pass phrases (Spector & Ginzberg, 
1994), images (Dhamija & Perrig, 2000), or pass 
faces (Brostoff & Sasse, 2000). 
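
One common way of deriving a password from a mnemonic phrase is to keep the first character of every word. The Python sketch below illustrates this idea only; it is an assumed, simplified rule and not the exact advice evaluated in the studies cited above, and the function name mnemonic_password is hypothetical.

```python
def mnemonic_password(phrase: str) -> str:
    """Keep the first character of each word, preserving case and digits."""
    return "".join(word[0] for word in phrase.split())


password = mnemonic_password("My sister Peg is 24 years old")  # "MsPi2yo"
```

The resulting string looks arbitrary to an observer, while the user only needs to recall the sentence.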

Authentication in public places, as is the case 
with automatic teller machines (ATM), has turned 
out to be vulnerable to attacks, where criminals 
obtain a user’s PIN by using cameras or other 
methods of observation in so-called shoulder-surf- 
ing attacks (Colville, 2003). In order to obscure the 
numbers entered by the user and thus hamper the 
recording of the necessary PIN, several techniques 
have been proposed (Hopper & Blum, 2001; Wilfong,
1997). Recently, Roth, et al. (2004) suggested vari- 
ants of cognitive trapdoor games to protect users 
against shoulder-surfing attacks. In this approach, 
the buttons on a PIN pad are colored either black or 
white, and the user has to decide whether the 
number to be entered is in the black or white group. 
As the colors are changing randomly, the user has to 
enter the same number three to four times to com- 
plete an input. By blurring the response set with so- 
called shadows, this method can be made resistant 
against camera attacks. Even though this approach 
is slightly more complicated than the classical ap- 
proach, this technique has proven to be accepted by 
the user in an experimental setting. 
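
The basic decoding idea behind the black-and-white PIN pad can be simulated as follows. This Python sketch rests on simplifying assumptions that are not taken from the cited work: the ten digits are split at random into five black and five white keys in every round, and the rounds continue until exactly one digit is consistent with all of the user's color answers. The function name enter_digit is hypothetical.

```python
import random
from typing import Tuple

DIGITS = list(range(10))


def enter_digit(secret: int, rng: random.Random) -> Tuple[int, int]:
    """Return (decoded digit, number of rounds) for one PIN digit."""
    candidates = set(DIGITS)
    rounds = 0
    while len(candidates) > 1:
        shuffled = DIGITS[:]
        rng.shuffle(shuffled)
        black = set(shuffled[:5])              # the other five keys are white
        answer_is_black = secret in black      # the user only reveals a color
        group = black if answer_is_black else set(DIGITS) - black
        candidates &= group                    # digits still consistent with all answers
        rounds += 1
    return candidates.pop(), rounds


decoded, rounds_needed = enter_digit(7, random.Random(0))
```

An observer who sees only the user's answers, but not the coloring of the keys, learns nothing about the digit; an observer who records both can decode it in the same way as the terminal, which is why the cited approach adds further measures against camera attacks.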

E-MAIL SECURITY 

Before the middle of the 1970s, cryptography was 
built entirely on symmetric ciphers. This meant that 
in order for enciphered communication to take place, 
a secret key needed to be exchanged beforehand 
over a secure out-of-band channel. One way of 
doing that was to send a trusted courier to the party 
with whom one intended to communicate securely. 
This procedure addressed two important issues: the 
secret key exchange and the implicit authentication 
of the exchanged keys. Once established, the keys 
could be used to secure communication against 
passive and active attacks until the key was ex- 
pected to become or became compromised. 

When asymmetric cryptography (Diffie & 
Hellman, 1976; Rivest, Shamir, & Adleman, 1978)
was invented in the 1970s, it tremendously simplified
the task of key exchange and gave birth to the
concept of digital signatures. Asymmetric cryptog- 
raphy did not solve the problem of authenticating 
keys per se. Although we now can exchange keys 
securely in the clear, how could one be certain that 
a key actually belonged to the alleged sender? 
Toward a solution to this problem, Loren Kohnfelder 
(1978) invented the public key certificate, which is a 
public key and an identity, signed together in a clever 
way with the private key of a key introducer whom 
the communicating parties need to trust. This idea 
gave rise to the notion of a public key infrastructure 
(PKI). Some existing models of public key infrastructures
are the OpenPGP Web of Trust model
(RFC 2440) and the increasingly complex ITU
Recommendation X.509-based PKIX model (RFC 3280)
(Davis, 1996; Ellison, 1996, 1997; Ellison & Schneier,
2000).
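
The principle of a public key certificate, namely an identity and a public key signed together with the private key of a trusted introducer, can be demonstrated with textbook RSA. The Python sketch below is a toy under explicit assumptions: the primes are far too small to be secure, no padding scheme is used, and the certificate layout and the function names digest, issue_certificate, and verify_certificate are invented for this example and do not follow the OpenPGP or X.509 formats.

```python
import hashlib

# Toy RSA key pair of the key introducer (textbook-sized primes, NOT secure).
P, Q = 61, 53
N = P * Q          # public modulus, 3233
E = 17             # public exponent
D = 2753           # private exponent, since (E * D) % ((P - 1) * (Q - 1)) == 1


def digest(identity: str, public_key: int) -> int:
    """Hash the identity and the key together and reduce the hash modulo N."""
    data = f"{identity}|{public_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def issue_certificate(identity: str, public_key: int) -> dict:
    """The introducer signs the pair (identity, public key) with the private key."""
    signature = pow(digest(identity, public_key), D, N)
    return {"identity": identity, "public_key": public_key, "signature": signature}


def verify_certificate(cert: dict) -> bool:
    """Anyone who trusts the introducer's public key (N, E) can check the binding."""
    expected = digest(cert["identity"], cert["public_key"])
    return pow(cert["signature"], E, N) == expected


cert = issue_certificate("alice@example.org", 65537)
is_valid = verify_certificate(cert)
```

Verification requires only the introducer's public key, so the remaining question, which the trust models mentioned above answer differently, is how users obtain and come to trust that key.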

In applications such as electronic mail, building 
trust in certificates, exchanging keys, and managing 
keys account for the majority of the interactions and 
decisions that interfere with the goal-oriented tasks 
of a user and that the user has difficulty understand- 
ing (Davis, 1996; Gutmann, 2003; Whitten & Tygar, 
1999). At the same time, the majority of users have
only a limited understanding of the underlying trust
models and concepts (Davis, 1996) and a weak 
perception of threats (see previous discussion). 
Consequently, they avoid or improperly operate the 
security software (Whitten & Tygar, 1999). 

In the safe staging approach, security functions 
may be grouped into stages of increasing complex- 
ity. A user may begin at a low stage and progress to 
a higher stage once the user understands and mas- 
ters the security functions at the lower stages. The 
safe-staging approach was proposed by Whitten & 
Tygar (2003), who also pioneered research on the 
usability of mail security by analyzing users’ perfor- 
mances when operating PGP (Whitten & Tygar, 
1999). 

SYSTEM SECURITY 

Computer systems progressed from single user sys- 
tems and multi-user batch processing systems to 
multi-user time-sharing systems, which brought the 
requirement to share the system resources and at
the same time to tightly control the resource allocation
as well as the information flow within the
system. The principal approach to solving this is to
establish verified supervisor software, also called
the reference monitor (Anderson, 1972), which con- 
trols all security-relevant aspects in the system. 

However, the Internet tremendously accelerated 
the production and distribution of software, some of 
which may be of dubious origin. Additionally, the 
increasing amounts of so-called malware that thrives 
on security flaws and programming errors lead to a 
situation where the granularity of access control in 
multi-user resource-sharing systems is no longer 
sufficient to cope with the imminent threats. Rather 
than separating user domains, applications themselves
increasingly must be separated, even if they
run on behalf of the same user. A flaw in a Web 
browser should not lead to a potential compromise of 
other applications and application data such as the 
user’s e-mail client or word processor. Despite 
efforts to provide solutions to such problems 
(Goldberg et al., 1996) as well as the availability of 
off-the-shelf environments in different flavors of 
Unix, fine-grained application separation has not yet 
been included as a standard feature of a COTS 
operating system. 
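
The idea of a reference monitor combined with application-level separation can be sketched as follows. The Python example is an illustrative assumption rather than the design of any existing operating system: subjects are modeled as pairs of user and application, and the class ReferenceMonitor, its check method, and the sample policy are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Subject = Tuple[str, str]          # (user, application)


class ReferenceMonitor:
    """Mediates every access request and records the decision."""

    def __init__(self, policy: Dict[Tuple[Subject, str], Set[str]]) -> None:
        self._policy = policy
        self.audit_log: List[Tuple[Subject, str, str, bool]] = []

    def check(self, subject: Subject, resource: str, right: str) -> bool:
        allowed = right in self._policy.get((subject, resource), set())
        self.audit_log.append((subject, resource, right, allowed))
        return allowed


policy = {
    (("alice", "mail_client"), "mailbox"): {"read", "write"},
    (("alice", "browser"), "downloads"): {"read", "write"},
}
monitor = ReferenceMonitor(policy)

mail_ok = monitor.check(("alice", "mail_client"), "mailbox", "read")
browser_ok = monitor.check(("alice", "browser"), "mailbox", "read")
```

Because rights are attached to the pair of user and application, a compromised browser running on behalf of the same user is still denied access to the mailbox in this sketch.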

Even if such separation were available, malicious 
software may delude the user into believing, for 
example, that a graphical user interface (GUI) com- 
ponent of the malware belongs to a different trusted 
application. One means of achieving this is to mimic 
the visual appearance and responses of the genuine 
application. One typical example would be a fake 
login screen or window. Assurance that a certain 
GUI component actually belongs to a particular 
application or the operating system component re- 
quires a trusted path between the user and the 
system. For instance, a secure attention key that 
cannot be intercepted by the malware may switch to 
a secure login window. While this functionality is
available in some COTS operating systems, current
GUIs still provide ample opportunity for disguise, a
problem that also is prominent on the Web (Felten,
Balfanz, Dean, & Wallach, 1997). One approach to 
solving this problem for GUIs is to appropriately 
mark windows so that they can be associated with 
their parent application (Yee, 2002). One instance of 
a research prototype windowing system designed 
with such threats in mind is the EROS Trusted 
Window System (Shapiro, Vanderburgh, Northup, 
& Chizmadia, 2003). 



FUTURE TRENDS 

Mobile computing and the emergence of context- 
aware services progressively are integrating into 
new and powerful services that hold the promise of 
making life easier and safer. Contextual data will 
help the user to configure and select the services the 
user needs and even might elicit support proactively. 
Far beyond that, Ambient Intelligence (IST Advisory
Group, 2003) is an emerging vision of dynamically
communicating and cooperating appliances
and devices in order to provide an intelligent
surrounding for tomorrow’s citizens. Radio frequency
identification transmitters (RFID) already have been 
discussed with respect to their implications on pri- 
vacy (Weis, 2004). Certainly, one person’s contex- 
tual awareness is another person’s lack of privacy 
(Hudson & Smith, 1996). In the future, the develop- 
ment of powerful and usable security concepts for 
applying personal information to the context and vice 
versa is one of the greatest challenges for today’s 
security engineers and human-computer interaction 
research (Ackerman, Darell, & Weitzner, 2001).
Accomplishing this task seems crucial for the future
acceptance and success of such technologies without
them becoming a “privacy Chernobyl” (Agre, 1999).

CONCLUSION

The view of the user as the weakest link and 
potential security danger finally has turned out to be 
an obsolescent model. Security engineers and per- 
haps, more importantly, those people who are re- 
sponsible for IT security have noticed that working 
against the user will not do, and instead, they have 
decided to work with and for the user. During the 
past years, an increasing amount of research has
focused on the issue of making security usable, 
addressing the traditional fields of authentication, 
communication, and e-mail and system security. 
This article has given a brief overview of some of the 
work done so far. In making information
technology more secure, the user is the central
figure. The user must be able to properly use the
security mechanisms provided. Therefore, under- 
standing users’ needs and identifying the reasons 
that technology fails to convince users to employ it 
is crucial. The first part of this article summarized 
some work done by Dourish, Weirich, and Sasse that 
provided important insights. But much work still has 
to be done. 

Future technology will build even more on the 
integration and sharing of heterogeneous sources of 
information and services. The tendency toward dis- 
tributed and location-based information infrastruc- 
tures will lead to new security problems. Feeling 
safe is an important aspect of acceptance. The 
success of tomorrow’s systems also will depend on 
the user’s feeling safe while sharing information and
using services, which has already been shown during
the first stage of e-commerce. Therefore, making 
security usable is an important aspect of making 
security safe. 

KEY TERMS

Asymmetric Cryptography: A data encryption 
system that uses two separate but related encryption 
keys. The private key is known only to its owner, 
while the public key is made available in a key 
repository or as part of a digital certificate. Asym- 
metric cryptography is the basis of digital signature 
systems. 

Public Key Infrastructure (PKI): The public 
infrastructure that administers, distributes, and cer- 
tifies electronic keys and certificates that are used to 
authenticate identity and encrypt information. Gen- 
erally speaking, PKI is a system of digital certifi- 
cates, certification authorities, and registration au- 
thorities that authenticate and verify the validity of 
the parties involved in electronic transactions. 

Shoulder Surfing: The practice of observing 
persons while entering secret authentication information
in order to obtain illegal access to money or
services. This often occurs in the context of PIN
numbers and banking transactions, where shoulder 
surfing occurs together with the stealthy duplication 
of credit or banking cards. 

Social Engineering: The technique of exploit- 
ing the weakness of users rather than software by 
convincing users to disclose secrets or passwords by 
pretending to be authorized staff, network adminis- 
trator, or the like. 

Spoofing: The technique of obtaining or mimick- 
ing a fake identity in the network. This can be used 
for pretending to be a trustworthy Web site and for
motivating users to act (e.g., to enter banking
information), pretending to be an authorized instance that
requests the user’s password, or making users ac- 
cept information that is believed to come from a 
trusted instance. 

Types of Authentication: Authentication gen- 
erally can be based on three types of information: by 
something the user has (e.g., bank card, key, etc.),
by something the user knows (e.g., password, num- 
ber, etc.), or by something the user is (e.g., biometric 
methods like fingerprints, face recognition, etc.). 

INTRODUCTION

A significant fraction of our society suffers from 
different types of physical and cognitive challenges. 
The seriousness of the problem can be gauged from 
the fact that approximately 54 million Americans are 
classified as disabled (Ross, 2001). In addition, 
approximately 7% of all school-age children experi- 
ence moderate to severe difficulties in the compre- 
hension and production of language due to cognitive, 
emotional, neurological, or social impairments (Evans, 
2001). The problems faced by this community are 
diverse and might not be comprehended by their 
able-bodied counterparts. These people can become 
productive and independent, if aids and devices that 
facilitate mobility, communication, and activities of 
daily living can be designed. 

Researchers in the human-computer interaction 
and rehabilitation engineering communities have 
made significant contributions in alleviating the 
problems of the physically challenged. The technology 
that assists the physically challenged in leading a 
normal life is termed assistive technology. This article 
dwells on different aspects of assistive technology 
that have found application in real life. 

One of the important approaches to assistive 
technology is the use of iconic environments that 
have proved their efficacy in dealing with some of 
the communication problems of the physically chal- 
lenged. The second part of the article discusses the 
issues and methods of applying iconic interfaces to 
assist the communication needs of the physically 
challenged. 

BACKGROUND 

The problems faced by the disabled section of our 
society are huge and of a diverse nature. Disabilities 



can be classified into physical or cognitive disabili- 
ties. Physical disabilities like restricted mobility and 
loss of hearing, speaking, or visual acuity severely 
affect the normal life of some people. People suffering 
from such handicaps need an assistant to help them 
perform their routine activities and use standard 
appliances. 

The case of cognitively challenged people is even 
more serious. Their difficulties can range from 
deficits in vocabulary and word-finding to impair- 
ments in morphology, phonology, syntax, pragmat- 
ics, and memory (Evans, 2001). Persons suffering 
from autism show delay in language development; 
complete absence of spoken language; stereotyped, 
idiosyncratic, or repetitive use of language; or an 
inability to sustain a conversation with others. The 
problems faced by a dyslexic person can include 
difficulties with spelling, number and letter 
recognition, punctuation, letter reversals, word 
recognition, and fixation (Gregor et al., 2000). 
Brain impairments can lead to learning, attention 
span, problem-solving, and language disorders 
(Rizzo et al., 2004). 

Difficulties in using a standard computer stem 
from problems like finding command button prompts, 
operating a mouse, operating word processing, and 
finding prompts and information in complex displays. 
The complexity of a GUI display and the desktop 
metaphor create severe problems (Poll et al., 1995). 
In the case of motor-impaired subjects, input is 
often noisy and extremely slow. 

To solve these problems, which are inherently 
multi-disciplinary and non-trivial, researchers from 
different branches have come together and inte- 
grated their efforts. Assistive technology, therefore, 
is a multidisciplinary field and has integrated re- 
searchers from seemingly disparate interests like 
neuroscience, physiology, psychology, engineering, 
computer science, rehabilitation, and other technical 
and health-care disciplines. It aims at reaching an 
optimum mental, physical, and/or functional level 
(United Nations, 1983). In the following, we look at 
the methodology adopted in this field in order to solve 
the aforementioned problems. 

AN OVERVIEW OF ASSISTIVE 
DEVICES 

Solving the whole gamut of problems faced by this 
community requires the construction of what are 
called smart houses. Smart houses are used by old 
people and people with disabilities. An extensive 
review of the issues concerning smart houses appears 
in Stefanov et al. (2004). Broadly speaking, 
these houses contain a group of equipment that 
caters to the different needs of the inhabitants. The 
technology installed in these homes should be able to 
adapt to each person’s needs and habits. 

However, the construction of this technology is 
not a simple task. Assistive and Augmentative Com- 
munication (AAC) aims to use computers to simplify 
and quicken the means of interaction between the 
disabled community and their able-bodied counter- 
parts. The tasks of AAC, therefore, can be seen as 
facilitating the interaction with the world and the use 
of computers. Interaction with the world is facili- 
tated by one of the following methods: 

• Design of robotic systems for assistance. 

• Design of systems that help in indoor naviga- 
tion, such as smart wheelchairs. 

• Devices that convert normal speech to alpha- 
betic or sign language (Waldron et al., 1995). 

• Devices that convert sign language gestures 
into voice-synthesized speech, computer text, 
or electronic signals. 

• Design of special software, such as screen readers 
and text-to-speech systems, for the blind population 
(Burger, 1994). 

For physically disabled people, researchers have 
designed motorized wheelchairs that are capable of 
traversing uneven terrains and circumventing ob- 
stacles (Wellman et al., 1995). Robots have been 
used to assist users with their mundane tasks. Stud- 
ies have shown that task priorities of users demand 
a mobile device capable of working in diverse and 



unfamiliar environments (Stanger et al., 1994). Smart 
wheelchairs meet some of these requirements. They 
are capable of avoiding obstacles and can operate in 
multiple modes, each of which follows a particular 
navigation strategy (Levine et al., 1999). 

The problem is quite different in the case of the 
visually disabled population. Visual data are rich and 
easily interpreted. Therefore, to encode visual data 
to any other form is not trivial. Haptic interface 
technology seeks to fill this gap by making digital 
information tangible. However, haptic interfaces 
are not as rich as visual interfaces in dissemination 
of information. To make these haptic environments 
richer and, hence, more useful, methods like speech 
output, friction, and texture have been added to 
highlight different variations in data, such as color 
(Fritz et al., 1999). Braille was devised in order for 
blind people to read and write words. Letters can be 
represented through tactile menus, auditory pat- 
terns, and speech in order to identify them. 

As far as assistance for navigation in physical 
terrains is concerned, visually impaired people can 
use dogs and canes to avoid obstacles. However, 
these options are limited in many respects. For 
example, they might not help in avoiding higher 
obstacles like tree branches. 
In Voth (2004), the author explains the working of a 
low-cost, wearable vision aid that alerts its user to 
stationary objects. The technology is based on the 
observation that objects in the foreground reflect 
more light than those that are not. The luminance of 
different objects is tracked over several frames, and 
the relative luminance is compared in order to iden- 
tify objects that are coming closer. The software 
informs the user when the object comes too close 
(i.e., at an arm’s length), and a warning icon is 
displayed onto a mirror in front of the eyes of the 
user. 
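
The luminance-tracking idea can be illustrated with a minimal sketch. It is not the algorithm described in Voth (2004): the region names, the growth threshold, and the function names below are assumptions made purely for illustration. Each frame is reduced to a few region luminance values, and a region is flagged when its luminance relative to the frame mean rises in every successive frame.

```python
from statistics import mean

def relative_luminance(frame):
    """Return each region's luminance divided by the frame's mean luminance."""
    frame_mean = mean(frame.values())
    return {region: value / frame_mean for region, value in frame.items()}

def approaching_regions(frames, min_growth=0.10):
    """Flag regions whose relative luminance rises in every consecutive frame
    and grows by at least `min_growth` overall (a hypothetical threshold)."""
    rel = [relative_luminance(f) for f in frames]
    flagged = []
    for region in rel[0]:
        series = [r[region] for r in rel]
        rising = all(b > a for a, b in zip(series, series[1:]))
        if rising and series[-1] - series[0] >= min_growth:
            flagged.append(region)
    return sorted(flagged)

# Three illustrative frames: only the centre region gets steadily brighter.
frames = [
    {"left": 100, "centre": 100, "right": 100},
    {"left": 100, "centre": 120, "right": 100},
    {"left": 100, "centre": 150, "right": 100},
]
print(approaching_regions(frames))  # ['centre']
```

Comparing relative rather than absolute luminance keeps a uniform change in lighting from being mistaken for an approaching object.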

To help these people to use computers more 
effectively, the following three types of problems 
must be handled: 

• It should be noted that this population might be 
unable to provide input in the required form. 
This inability becomes critical when physical or 
cognitive challenges seriously inhibit the move- 
ment of limbs. This entails the design of spe- 
cial-purpose access mechanisms for such 
people. This leads to the design of hardware 
plug-ins, which can provide an interface be- 
tween the subject and the computer. Scanning 
emulators have been used to help them to inter- 
act with the computer. Instruments like a head- 
operated joystick that uses infrared LEDs and 
photo detectors also have been designed (Evans 
et al., 2000). 

• Cognitive impairments lead to slow input. To 
ensure that the communication runs at a practi- 
cal speed, it is mandatory that we design strat- 
egies to quicken the rate of input. Two strate- 
gies are used to increase the rate of input. One 
of them is to work with partial input and infer its 
remaining constituents. This works because our 
language has many redundant and predictable 
constituents. Given a model of language, we 
can work with truncated and incomplete input. 
The second strategy, which increases the rate 
of the input, is adaptive display techniques. This, 
too, depends upon a model of the domain. This 
model is used to predict the next most likely 
input of the user, and the system adapts itself to 
make sure that the user makes less effort to 
choose this input. The same philosophy works in 
the case of toolkits like word prediction soft- 
ware. 

• Language and cognitive disabilities also lead to 
noise in communication. Therefore, the meth- 
ods used should be robust and fault-tolerant. 
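
The two rate-enhancement strategies in the second point can be sketched together. The example below is only an illustration under stated assumptions: the "language model" is a plain word-frequency table, and the class and method names are hypothetical. Partial input is completed from the model, and the suggestion order adapts as the user accepts words, so that likely inputs need less effort to reach.

```python
from collections import Counter

class PrefixPredictor:
    """Completes truncated input from a word-frequency model (illustrative only)."""

    def __init__(self, corpus_words):
        self.counts = Counter(corpus_words)

    def suggest(self, prefix, k=3):
        """Return up to k completions, most frequent first (ties alphabetical)."""
        candidates = [(w, c) for w, c in self.counts.items() if w.startswith(prefix)]
        candidates.sort(key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in candidates[:k]]

    def accept(self, word):
        """Adapt the model: an accepted word becomes more likely next time."""
        self.counts[word] += 1

predictor = PrefixPredictor("the cat sat on the mat the end then".split())
print(predictor.suggest("th"))   # ['the', 'then']
for _ in range(3):
    predictor.accept("then")
print(predictor.suggest("th"))   # ['then', 'the']
```

A real system would use a richer language model, but the adaptive reordering step is the same in spirit: the display changes so that the most probable choice is the cheapest one to select.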

We now discuss some of the systems that imple- 
ment the aforementioned ideas. Reactive keyboards 
(Darragh et al., 1990) aim to partially automate and 
accelerate communication by predicting the next 
word the user is going to type. This requires adaptive 
modeling of the task, based on the previously entered 
text or a language model. However, there is evidence 
that the cognitive load accrued in searching among 
the predicted list outweighs the keystroke savings 
(Koester et al., 1994). Prediction of texts also can be 
augmented with the help of a semantic network 
(Stocky et al., 2004). Signing, finger spelling and lip 
reading are used for communicating with the deaf. 
People also have attempted sign language communi- 
cation through telephone lines (Manoranjan et al., 
2000). Numerous applications, such as translation of 
word documents to Braille, have been developed 
(Blenkhorn et al., 2001). 
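
The trade-off between keystroke savings and search effort can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not the measure used by Koester et al. (1994): it supposes a fixed vocabulary ordered by assumed frequency, a prediction list of k items, and one extra keystroke to select an item. All function names are hypothetical.

```python
def suggest(prefix, vocabulary, k=3):
    """Return up to k vocabulary words that start with prefix.
    The vocabulary is assumed to be ordered from most to least frequent."""
    return [w for w in vocabulary if w.startswith(prefix)][:k]

def keystrokes_with_prediction(word, vocabulary, k=3):
    """Letters typed until the word shows up in the list, plus one selection key."""
    for typed in range(1, len(word)):
        if word in suggest(word[:typed], vocabulary, k):
            return typed + 1
    return len(word)  # never predicted in time: type the whole word

def keystroke_savings(words, vocabulary, k=3):
    """Fraction of keystrokes saved compared with typing every letter."""
    plain = sum(len(w) for w in words)
    assisted = sum(keystrokes_with_prediction(w, vocabulary, k) for w in words)
    return 1 - assisted / plain

vocabulary = ["water", "want", "wheelchair", "please", "window"]
message = ["want", "water", "please"]
print(keystroke_savings(message, vocabulary))
```

The calculation counts only key presses. It deliberately leaves out the time spent scanning the prediction list, which is exactly the cognitive cost that can outweigh the savings.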



The most difficult problem facing the human-computer 
interaction community is to find appropriate 
solutions to problems faced by cognitively challenged 
people. For solutions to problems like autism 
and dyslexia, a detailed model of the brain 
deficiencies needs to be known. Some efforts have been 
made to solve these problems. The case in which 
rapid speech resulted in problems of comprehen- 
sion was dealt with in Nagarajan et al. (1998). 
Different speech modulation algorithms have been 
designed, which have proved to be effective for this 
population. Attempts have been made to design 
word processors for dyslexic people (Gregor et al., 
2000). 

The case of autistic people is similar. Generally, 
autistic conversation is considered as disconnected 
or unordered. Discourse strategies have been used 
to find out the patterns of problems in such people. 
These studies have concentrated on finding the 
typical features of conversation with an autistic 
patient. These include length and complexity of 
structure; categories of reference; ellipsis; and 
phonological, syntactic, and semantic interrelation- 
ships. Studies also have shown that autistic people 
have a tendency to turn to earlier topics within a 
conversation and turn to favored topics cross- 
conversationally. Solutions to the problems faced 
by these communities are important research ar- 
eas. To diagnose their deficiencies and to use 
models of man-machine dialogue in order to com- 
prehend and support the dialogue is a non-trivial and 
challenging task. 

Another reason for difficulty in using computers 
stems from computer naivety, which can lead to 
problems in using and dealing with the desktop 
metaphor. This can lead to poor knowledge of the 
keys, poor knowledge of the way in which applications 
work, and poor acquisition of the conceptual 
model for mouse-cursor position. Dealing with mouse 
pointers becomes 
even more cumbersome when the vision is imper- 
fect. The mouse pointer may get lost in the complex 
background or be beyond the desktop boundaries 
without the user realizing it. The problem reaches 
epic proportions when small screens are used. 
Mouse pointer magnification and auditory signals 
can be used to keep the user aware of the precise 
location of the pointer. The software also can be 
configured to help the user ask for help in cases 
when the pointer cannot be found. To adapt to 
idiosyncrasies of the user, online learning algorithms 
have been designed that move the spatial position of 
the keys of the keyboard to adapt to these changes 
(Himberg et al., 2003). Studies also have analyzed 
how the trail of the mouse cursor can be used to 
analyze the nature of difficulties of the user (Keates 
et al., 2002). 

The protocols and schemas of different applica- 
tions might be cumbersome to learn for these people. 
For example, it might be difficult to understand and 
to learn to use an e-mail server (Sutcliffe et al., 
2003). Finite state models of behavior can be induced 
from data in order to predict and assist users in 
routine tasks. However, the emphasis of researchers 
is toward removing the barriers of fixed protocols of 
behavior for these people and moving toward free, 
unordered input. This unordered input, in turn, 
increases the search space and the order of 
complexity of the problem. 
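
Inducing a state model from usage data can be sketched as follows. This is a minimal illustration, assuming that each session is logged as a list of action names; the action names and function names are hypothetical, and the model is a simple first-order transition table rather than any specific published method.

```python
from collections import Counter, defaultdict

def induce_transitions(sessions):
    """Count how often each action is followed by each other action."""
    transitions = defaultdict(Counter)
    for actions in sessions:
        for current, following in zip(actions, actions[1:]):
            transitions[current][following] += 1
    return transitions

def most_likely_next(transitions, state):
    """Return the most frequent follow-up action, or None if the state is unseen."""
    if state not in transitions:
        return None
    options = transitions[state]
    # Sorting first makes ties resolve alphabetically.
    return max(sorted(options), key=options.__getitem__)

# Hypothetical e-mail sessions recorded from earlier use.
sessions = [
    ["open_mail", "read", "reply", "send"],
    ["open_mail", "read", "delete"],
    ["open_mail", "read", "reply", "send"],
]
model = induce_transitions(sessions)
print(most_likely_next(model, "read"))  # reply
```

A system could use such a prediction to highlight or preselect the next step of a routine task. Allowing unordered input means the model can no longer rely on one fixed sequence, which is where the larger search space comes from.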

It is clear from the previous discussion that 
assistive devices must be reactive and attentive (i.e., 
they must watch over the user’s shoulder, 
reason about the user’s needs and motivations, and 
adapt the system to make sure that the user completes 
his or her work with minimum effort). This 
requires induction of a user model from the informa- 
tion available from the previous usage of the system. 
This user model then can be used to predict future 
user actions. In Neill et al. (2000), the authors 
describe how sequence profiling can be done to 
learn to predict the direction in which the user is 
likely to move on a wheelchair interface. Motor 
disabilities may lead to problems like pressing more 
than one key at a time. To prevent these, hardware 
solutions such as a keyguard can be used, or a 
software layer can be used, which can use a lan- 
guage model to detect and correct errors (Trewin, 
1999, 2002). 
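
A software layer of this kind might look like the sketch below. It is not Trewin's method: the keyboard neighbour map is a small hypothetical excerpt, the "language model" is just a word-frequency table, and only one kind of error is handled, namely an extra character produced by pressing a neighbouring key at the same time.

```python
# Hypothetical excerpt of a QWERTY neighbour map (key -> adjacent keys).
NEIGHBOURS = {
    "h": "gjyunb",
    "e": "wrsd",
    "l": "kop",
    "o": "iplk",
    "t": "rfgy",
    "a": "qwsz",
}

def is_neighbour(a, b):
    """True if the two keys are listed as adjacent in either direction."""
    return b in NEIGHBOURS.get(a, "") or a in NEIGHBOURS.get(b, "")

def correct(typed, word_counts):
    """Drop one character that neighbours the key next to it, if that yields
    a known word; prefer the most frequent such word."""
    if typed in word_counts:
        return typed
    candidates = set()
    for i, ch in enumerate(typed):
        adjacent = [typed[j] for j in (i - 1, i + 1) if 0 <= j < len(typed)]
        if any(is_neighbour(ch, other) for other in adjacent):
            candidate = typed[:i] + typed[i + 1:]
            if candidate in word_counts:
                candidates.add(candidate)
    if not candidates:
        return typed  # nothing plausible found: leave the input untouched
    return max(sorted(candidates), key=word_counts.__getitem__)

word_counts = {"hello": 50, "help": 30, "the": 100}
print(correct("hrello", word_counts))  # hello
print(correct("thge", word_counts))    # the
```

A keyguard prevents the second key from being pressed at all, whereas this approach accepts the noisy input and repairs it afterwards.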

Wisfids (Steriadis et al., 2003) can be seen as 
software agents that capture user events and as 
drivers that accept input signals addressed to them. 
They transform this input to useful information, 
which is further processed by software applications 
on which they are running. Row-column scanning 
can be made adaptive by using errors and reaction 
time as the parameters governing the scan delay 
(Simpson et al., 1999b). 
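
One simple way to let errors and reaction time govern the scan delay is sketched below. The rule and all constants are assumptions chosen for illustration and are not taken from Simpson et al. (1999b): the delay is lengthened when the error rate is high, and is otherwise set to a safety margin above the user's mean reaction time.

```python
def adapt_scan_delay(delay, reaction_times, errors, selections,
                     margin=1.5, max_error_rate=0.1,
                     min_delay=0.3, max_delay=3.0):
    """Return a new scan delay in seconds (all thresholds are hypothetical)."""
    if selections == 0 or not reaction_times:
        return delay  # no evidence yet: keep the current delay
    error_rate = errors / selections
    if error_rate > max_error_rate:
        proposed = delay * 1.2  # too many errors: slow the scan down
    else:
        mean_reaction = sum(reaction_times) / len(reaction_times)
        proposed = margin * mean_reaction  # track the user's reaction time
    return max(min_delay, min(max_delay, proposed))

# A user who reacts in about half a second and makes no errors.
print(adapt_scan_delay(1.0, [0.4, 0.6], errors=0, selections=10))  # 0.75
```

Clamping the result keeps the scan from becoming either unusably fast or needlessly slow after a few unusual selections.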



ICONIC ENVIRONMENTS 

Iconic interfaces are one of the popular methods that 
are used for communication by the cognitively chal- 
lenged population. These interfaces have many ad- 
vantages, vis-a-vis traditional text-based messaging 
systems. By being visual rather than textual, they are 
more intuitive and overcome the need to be literate 
in order to carry forward the communication. Due to 
the semantic uniformity of icons, they have been 
used for translation to Braille scripts (Burger, 1994) 
and for communication by deaf people (Petrie et al., 
2004). The universal nature of icons avoids the 
idiosyncrasies of different languages. They also 
speed up the rate of input by removing inferable 
constituents of communication, such as preposi- 
tions. These advantages have made icons pervasive 
in modern computing systems and ubiquitous in 
communication and assistive aids. 

However, this strength and richness comes at a 
cost. Use of icons for communication requires their 
interpretation, which is a non-trivial task. Use of 
simple icons makes the disambiguation easier. How- 
ever, it increases the size of the vocabulary. Search- 
ing in a large icon set using a scroll bar is likely to be 
difficult for motor-impaired people. Unordered input 
and leaving out syntactical cues such as prepositions 
make the search space larger. Use of syntax-directed 
methods presupposes the knowledge of different 
case-roles. Therefore, if the iconic environments 
are to be useful and practical, they must accept 
icons in random order and provide the facilities for 
overloading the meaning of icons. Semantically 
overloaded icons, being polysemous, reduce the size 
of the vocabulary. A small vocabulary implies less 
search overhead. This is possible only if these 
interfaces are supplemented by robust and rich 
inference mechanisms to disambiguate them. 

Demasco et al. (1992) were probably the first to 
attempt the problem. The authors address the prob- 
lem of expansion of compressed messages into 
complete intelligible natural language sentences. A 
semantic parser uses syntactic categories to make a 
conceptual representation of the sentence and passes 
it to a translation component. 

Extension of the work by Demasco has been 
reported in Abhishek et al. (2004). They explained 
the working of a prototype system, which could 
generate the natural language sentences from a set 
of unordered semantically overloaded icons. In 
particular, they formulated the problem as a constraint 
satisfaction problem over the case roles of the verb 
and discussed different knowledge representation 
issues, which must be tackled in order to generate 
semantically correct iconic sentences. In another 
approach for disambiguation, Albacete et al. (1998) 
used conceptual dependency to generate natural 
language sentences. The formalism has the strength 
that it can handle considerable ambiguity. New 
concepts are generated from different operations on 
different concepts. The strength of the formalism 
derives from the fact that the authors define 
primitive connectors that encode rules defining how 
the operators modify categories. 
The aforementioned systems can support the over- 
loading of meanings to a large extent. While the 
former derives the overloading via use of complex 
icons, the latter uses operators to derive new 
concepts from existing concepts. These systems do not 
report empirical evaluation of their approaches. 
However, it can be seen that understanding the 
composition of operators might be difficult for people 
suffering from brain impairments and attention span 
problems. 
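
The constraint satisfaction view of disambiguation can be illustrated with a toy example. It is a sketch under stated assumptions and not the system of Abhishek et al. (2004): the icon lexicon, the semantic types, and the case frame for the verb are all hypothetical. Each overloaded icon may denote several semantic types, and a solution assigns every icon to a distinct case role whose type constraint it satisfies, regardless of the order in which the icons were selected.

```python
from itertools import permutations

# Hypothetical lexicon: semantic types each (overloaded) icon may denote.
ICON_SENSES = {
    "boy": {"human", "animate"},
    "chicken": {"animate", "food"},
    "spoon": {"tool"},
}

# Hypothetical case frame for the verb "eat": role -> required semantic type.
EAT_FRAME = {"agent": "animate", "object": "food", "instrument": "tool"}

def assign_roles(icons, frame):
    """Return every assignment of icons to distinct case roles that
    satisfies the type constraint of each role."""
    solutions = []
    for chosen in permutations(list(frame), len(icons)):
        if all(frame[role] in ICON_SENSES[icon]
               for icon, role in zip(icons, chosen)):
            solutions.append(dict(zip(chosen, icons)))
    return solutions

# The same reading is found whichever icon is selected first.
print(assign_roles(["chicken", "boy"], EAT_FRAME))
print(assign_roles(["boy", "chicken"], EAT_FRAME))
```

Here the icon "chicken" could be an agent or an object, but the constraints leave only one consistent reading, in which the boy is the agent. Exhaustive enumeration is used for clarity; with larger vocabularies the growth of the search space is exactly the cost discussed above.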

To make a universal iconic system, it is important 
to consider cultural issues like symbols, colors, func- 
tionality, language, orthography, images, appear- 
ance, perception, cognition, and style of thinking. It 
has been verified experimentally that Minspeak 
icons need to be customized for users of different 
cultural contexts (Merwe et al., 2004). The design of 
a culturally universal icon set is an open issue. 

FUTURE TRENDS 

In the years to come, we will witness the trend to 
move toward affective and adaptive computing. If 
machines have to take over from humans in critical, 
health-related domains, they must be able to assess 
the physiological and emotional state of the user in 
real time. Progress in this field has been slow and 
steady. See Picard (2003) for challenges facing 
affective computing. 

As society ages, it is likely that people will suffer 
from many disabilities that come because of age. It 
also is known that most people suffer many problems 



in performing their day-to-day activities (Ross, 2001). 
Devices that are used for people with progressive 
debilitating diseases must have the flexibility to 
change with users’ needs. In general, the need for 
these devices to be adaptive cannot be discounted. 
These devices should be able to accommodate a 
wide range of users’ preferences and needs. 
Wayfinders and devices helping to navigate an un- 
certain environment are difficult to come by. For 
example, it might be difficult for a deaf and blind 
person to cross a street. Research for creating 
practical, real-time, embedded, intelligent systems 
that can operate in uncertain and dynamic environ- 
ments is required. 

Evaluation and scaling up of iconic environments 
have become pressing needs. Generation of seman- 
tically correct sentences is the core issue in this 
field. Knowledge-based or domain-specific systems 
do not help the cause of creating communication 
across languages. We foresee the use of richer 
reasoning and learning methods in order to move 
away from the bottleneck of encoding of huge and 
diverse world knowledge. This will require funda- 
mental research in the semantics of language and 
our protocols of society. 

CONCLUSION 

In this article, we have discussed the issues concern- 
ing assistive devices. We then expounded the meth- 
ods and techniques used by researchers to solve 
them. Iconic interfaces play a crucial role in one of 
the sectors of rehabilitation. However, their strength 
can be exploited only if rich inference methods can 
be designed to disambiguate them. We then com- 
pared some of the methods used for this task. We 
concluded that fundamental research in all areas of 
natural language and brain deficiencies must be 
done to solve these problems satisfactorily. 
KEY TERMS 

Adaptive User Interfaces: Interfaces that use 
a user model to change their behavior or appearance 
to increase user satisfaction with time. These 
interfaces are used extensively in assistive devices. 

Assistive and Augmentative Communication: 
A multi-disciplinary field that seeks to design 
devices and methods to alleviate the problems faced 
by physically challenged people. 

Autism: A disease that leads to language disor- 
ders like delay in language development, repeated 
use of language, and inability to sustain conversation 
with others. 

Disambiguation in Iconic Interfaces: The 
process of context-sensitive, on-the-fly semantic 
interpretation of a sequence of icons. The process is 
difficult because of the huge world knowledge 
required for comprehending and reasoning about 
natural language. 

Smart Houses: Houses that are equipped with 
self-monitoring assistive devices of many types. 
Smart houses are popular with old people. 

User Model: A model induced by machine- 
learning techniques from the available information 
and patterns of data from the user. This model is 
used by the system to predict future user actions. 



Widgets: The way of using a physical input 
device to input a certain value. These are exten- 
sively used and are popular in the case of people with 
neuromotor disabilities. 
INTRODUCTION 

Empathy has been defined as, “An observer reacting 
emotionally because he perceives that another is 
experiencing or about to experience an emotion” 
(Stotland, Mathews, Sherman, Hannson, & 
Richardson, 1978). Synthetic characters (computer 
generated semi-autonomous agents corporeally 
embodied using multimedia and/or robotics, see Fig- 
ure 1) are becoming increasingly widespread as a 
way to establish empathic interaction between users 
and computers. For example, Feelix, a simple 
humanoid LEGO robot, is able to display different 
emotions through facial expressions in response to 
physical contact. Similarly, Kismet was designed to 
be a sociable robot able to engage and interact with 
humans using different emotions and facial 
expressions. Carmen’s Bright Ideas is an interactive 
multimedia computer program to teach a problem-solving 
methodology and uses the notion of empathic 
interactions. Research suggests that synthetic 
characters have particular relevance to domains with 
flexible and emergent tasks where empathy is crucial 
to the goals of the system (Marsella, Johnson, & 
LaBore, 2003). 

Using empathic interaction maintains and builds 
user emotional involvement to create a coherent 
cognitive and emotional experience. This results in 
the development of empathic relations between the 
user and the synthetic character, meaning that the 
user perceives and models the emotion of the agent, 
experiencing an appropriate emotion as a consequence. 

Figure 1. Synthetic characters: FearNot (Hall et al., 2004); 
Feelix (Canamero, 2002; Canamero & Fredslund, 2000) and 
Kismet (Breazeal & Scassellati, 1999) 
BACKGROUND 

A number of synthetic characters have been developed 
where empathy and the development of empathic 
relations have played a significant role, in domains 
including theatre (Bates, 1994), storytelling (Machado, 
Paiva, & Prada, 2001) and personal, social, and 
health education (Silverman, Holmes, Kimmel, Ivins, 
& Weaver, 2002). Applications such as FearNot 
(Hall et al., 2004b) and Carmen’s Bright Ideas 
(Marsella et al., 2003) highlight the potential of 
synthetic characters for exploring complex social 
and personal issues, through evoking empathic reac- 
tions in users. 

In a similar vein, robotics research has started to 
explore both the physical and behavioural architec- 
ture necessary to create meaningful empathic inter- 
actions with humans. This has included examining 
robot personality traits and models necessary for 
empathic relations (Fong, Nourbakhsh, & Dautenhahn, 
2003) and the design of robotic facial expressions 
eliciting basic emotions to create empathic interac- 
tions (e.g., Canamero, 2002). Empirical evaluations 
have shown that humans do express empathy towards 
robots and have the tendency to treat robots as living 
entities (e.g., Sparky, a social robot; Scheeff, Pinto, 
Rahardja, Snibbe, & Tow, 2002). 

The results from research into empathic interac- 
tion with synthetic characters suggest that it is 
possible to evoke empathic reactions from users and 
that this can result in stimulating novel interactions. 
Further, research identifies that in empathising with 
characters a deeper exploration and understanding 
of sensitive social and personal issues is possible 
(Dautenhahn, Bond, Canamero, & Edmonds, 2002). 
This can lead to real-life impacts such as the devel- 
opment of constructive solutions, that is, Carmen’s 
Bright Ideas (Marsella et al., 2003). 

However, it remains unclear as to how empathy 
can be evoked by interaction and here, we focus on 
the impact of similarity on evoking empathy in child 
users. This article reports findings obtained in the 
VICTEC (Virtual ICT with Empathic Characters) 
project (Aylett, Paiva, Woods, Hall, & Zoll, 2005) 
that applied synthetic characters and emergent nar- 
rative to Personal and Health Social Education 
(PHSE) for children aged 8-12, in the UK, Portugal, 
and Germany, through using 3D self-animating char- 
acters to create improvised dramas. In this project, 



empathic interaction was supported using FearNot 
(Fun with Empathic Agents to Reach Novel Out- 
comes in Teaching). This prototype allowed children 
to explore physical and relational bullying issues, and 
coping strategies in a virtual school populated by 
synthetic characters. The main issue this article 
addresses is whether the level of similarity per- 
ceived by a child with a character has an impact on 
the degree of empathy that the child feels for the 
character. 



WHY SIMILARITY MATTERS 

Similarity is the core concept of identification 
(Lazowick, 1955) and a major factor in the develop- 
ment and maintenance of social relationships (Hogg 
& Abrams, 1988). The perception of similarity has 
significant implications for forming friendships, with 
studies identifying that where children perceive 
themselves as similar to another child, they are more 
likely to choose them as friends (Aboud & Mendelson, 
1998). The opposite has also been shown to be true, 
with children disliking those who are dissimilar to 
them in terms of social status and behavioural style 
(Nangle, Erdley, & Gold, 1996). This dislike of 
dissimilarity is especially evident for boys. 

Perceived similarity as a basis for liking and 
empathising with someone is also seen in reactions 
to fictional characters, where the perception of a 
character as similar to oneself and identifying with 
them will typically result in liking that character, and 
empathising with their situation and actions. This can 
be frequently seen with characters portrayed in 
cinema and television (Hoffner & Cantor, 1991; 
Tannenbaum & Gaer, 1965). Further, people are 
more likely to feel sorry for someone (real or a 
character) if they perceive that person as similar to 
themselves (von Feilitzen & Linne, 1975). 

To investigate the impact of similarity on children’s 
empathic reactions to the synthetic characters in 
FearNot, we performed a large scale study, further 
discussed in Aylett et al. (2005). Liking someone is 
strongly influenced by perceived similarity and re- 
search suggests that if a child likes a character they 
are more likely to empathise with them. Thus, in 
considering the impact of similarity on the evocation 
of empathy we looked at perceived similarity of 
appearance and behaviour and their impact on the 
like/dislike of characters, as well as two empathic 
measures (feeling sorry for a character and feeling 
angry with a character). 

THE FEARNOT STUDY 

FearNot was trialed at the “Virtually Friends” event at 
the University of Hertfordshire, UK, in June 2004. 
Three hundred and forty-five children participated in 
the event: 172 male (49.9%) and 173 female (50.1%). 
The sample age range was 8 to 11, with a mean age of 
9.95 (SD: 0.50). The sample comprised children from a 
wide range of primary schools in the South of England. 

Method 

Two classes from different schools participated each 
day in the evaluation event. All children individually 



interacted with FearNot on standard PCs. FearNot 
began with a physical bullying scenario comprising 
three episodes, and children had the role of an 
advisor to help provide the victim character with 
coping strategies to try and stop the bullying 
behaviour. After the physical scenario, children had 
the opportunity to interact with the relational sce- 
nario showing the drama of bullying among four 
girls. After the interaction children completed the 
Agent Evaluation Questionnaire (AEQ). This was 
designed in order to evaluate children’s perceptions 
and views of FearNot, see Table 1. This question- 
naire is based on the Trailer Questionnaire (Woods, 
Hall, Sobral, Dautenhahn, & Wolke, 2003) that has 
been used extensively with a non-interactive FearNot 
prototype as is reported in Hall et al. (2004b). 
Questions relating to choosing characters were 
answered by selecting character names (posters of 
the characters were displayed with both a graphic 



Table 1. Content of the agent evaluation questionnaire

Aspect / Nature of Questions

Character preference
• Character liked most
• Character liked least
• Prime character, who they would choose to be
• Character with whom child would most like to be friends

Character attributes
• Realism of movement (realistic to unrealistic)
• Smoothness of movement (smooth to jerky)
• Clothes appreciation (looked good to looked strange), liking (liked to
  did not like), and similarity to own (similar to what you wear to
  different to what you wear)
• Character age

Character conversations
• Conversation content (believable to unbelievable)
• Conversation interest (interesting to boring)
• Content similarity to own conversations (similar to different)

Interaction impact
• Victim's acceptance of advice (followed to paid no attention)
• Helping victim (helped a lot to not at all)

Bullying storyline
• Storyline believability (believable to unbelievable)
• Storyline length (right length to too long)

Similarity
• Character that looks most and least like you
• Character that behaves most and least like you

Empathy towards characters
• Feeling sorry for characters and, if yes, which character
• Feeling angry towards the characters and, if yes, which character
• Ideomotoric empathy based on expected behaviour

Figure 2. Liked most character (legend: boys, girls)

Key: Who played which character in the drama?
John: Victim; Paul: Defender; Luke: Bully; Frances: Victim; Janet: Bully Assistant; Sarah: Bully; Martina: Defender



and the name as an aide-mémoire). Children’s views
were predominantly measured on a 5-point
Likert scale.

Results 

Gender was a significant factor in the selection of 
which character was most similar in physical ap- 
pearance to you, with almost all of the children 
choosing a same gender character or none. There 
was a significant association for those children who
felt that a same gender character looked like them
and also liked a same gender character: boys (χ² =
23.534, (8, 108), p = 0.001), girls (χ² = 24.4, (4, 89), p
< 0.001), meaning that boys liked male characters
that looked like them, and girls liked female characters
that resembled them.
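
The association statistics above are chi-square values. As a minimal sketch of how such a statistic is computed from a contingency table, the following Python function derives the expected counts from the row and column totals. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def chi_square_statistic(table):
    """Return (chi2, dof) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical counts: rows = "looks like me" (same gender, other),
# columns = "liked most" (same gender, other). Not data from the study.
counts = [[30, 10], [10, 30]]
chi2, dof = chi_square_statistic(counts)
print(chi2, dof)
```

A larger statistic for a given number of degrees of freedom indicates a stronger departure from independence; the p-value would then be read from the chi-square distribution.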

As can be seen from Figure 3, children liked
those characters who looked the most similar to
them if the character played a defender, neutral, or
victim role. However, where the character was a
bully, children were not as likely to like the character
that they were similar to in appearance, particularly
among the girls. Thirty-five percent of boys who
looked like Luke liked him the most. Although almost
a third of the girls stated that they resembled a
female bully character in appearance, only 4 (2.5%)
liked them the most.

There were no significant differences related to 
whom you looked like and disliking characters, with 
the dislike clearly being based on alternative factors 
to appearance. Similar to the results of Courtney, 
Cohen, Deptula, and Kitzmann (2003), children dis- 
liked the bullies (aggressors) the most, followed by 
the victims and then the bystanders. Most children 
disliked Luke, the physical bullying protagonist fol- 
lowed by Sarah, the relational bully, then the victims. 
As in other results (Hall et al., 2004), children paid
scant attention to the bully assistants, and only 5% of 
children disliked Janet the most. 

A significant association was found between the
character children felt looked the most like them and
feeling sorry for characters in the drama. Looking
like any of the female characters (i.e., being female)
was more likely to result in feeling sorry for the
victims, with over 80% of those who felt that they
looked like any of the female characters feeling
sorry for the victims. If children (mainly boys) felt
that Luke (62%) looked the most like them, they
expressed the least amount of empathy towards the
characters in the dramas; however, only 67% of
those who felt that they looked like John felt sorry for
the victims, as compared to 87% of those (all female)
who felt they looked like Frances.

Figure 3. Character child looked most similar to in appearance and liked the most

A significant association was found between the 
character children felt looked most like them and 
feeling anger towards characters in the dramas. 
Again this result is related to gender, with signifi- 
cantly more girls than boys feeling anger towards the 
characters. However, the results still indicate that 
appearance similarity could have an impact on the 
evocation of anger. Boys who stated that Luke 
looked the most similar to them felt the least amount 
of anger towards characters (46%), followed by
John (61%) and Paul (78%). For the girls, those who
felt they looked most like Sarah the bully were most
likely to be angry (95.5%) compared to 71% of those
who looked most similar to Frances (the victim), 
suggesting that girls were more likely to be angry if 
the bully were similar to them, whereas boys were 
less likely to be angry if the bully were similar to 
them. For those children who stated that none of the 
characters looked like them, 66% identified that they 
felt angry, reflecting the higher number of boys than 
girls in this group. 

DISCUSSION 

The results indicate that greater levels of empathy 
are evoked in children if they perceive that they are 
similar to the characters. This would suggest that
when developers seek to evoke empathic interaction,
they should attempt to create synthetic characters
that are similar to the intended users. Interestingly,
our results also highlighted that while looking
like a character may result in you being more inclined 
to like them, if they exhibit morals, ethics, and 
behaviours that are socially unacceptable, such as 
bullying, this can have a significant impact on your 
liking of that character. This reflects real-world 
behaviour, with all reported studies of children’s 
reactions to aggressive behaviour/bullying support- 
ing the view that children are more likely to dislike 
aggressors the most, followed by victims and then 
bystanders (Courtney et al., 2003). Our results 
supported this view. 

Trusting and believing in synthetic characters 
and possible impact on real-life behaviour appears to 
be linked to perceived similarity. However, although 
perceived similarity may be a major factor in en- 
gagement with synthetic characters, there is also 
considerable evidence from the performing arts that 
engagement can readily occur with characters very 
dissimilar to oneself. 



FUTURE TRENDS 

This study has highlighted the potential for similarity 
and empathic interaction; however, further research 
is needed in this area. Future research directions 
include the impact of greater physical similarity on
empathic interaction, with research in virtual hu- 
manoids considering more realistic and similar fea- 
tures and expressions (Fabri, Moore, & Hobbs, 
2004). The importance of cultural similarity is also 
being investigated (Hayes-Roth, Maldonado, & 
Moraes, 2002) with results suggesting the need for 
high cultural homogeneity between characters and 
their users. While similarity may be of benefit, there
remains the spectre of the “Uncanny Valley” (Woods,
Dautenhahn, & Schulz, 2004). For example, a recent
study examining children’s perceptions of robot 
images revealed that “pure” human-like robots are 
viewed negatively compared to machine-human-like 
robots. Research is needed into determining what 
aspects of similarity need to be provided to enable 
higher levels of empathic interaction with synthetic 
characters, considering different modalities, senses, 
and interaction approaches. 

CONCLUSION 

This article has briefly considered empathic interac- 
tion with synthetic characters. The main focus of 
this article was on the impact of similarity on evoking 
empathic interaction with child users. Results suggest
that if children perceive that they are similar to
a synthetic character in appearance and/or behaviour,
they are more likely to like and empathise with
the character. Future research is needed to gain 
greater understanding of the level and nature of 
similarity required to evoke an empathic interaction. 
KEY TERMS

Autonomous Robot: A robot that is capable of 
existing independently from human control. 

Emergent Narrative: Aims at solving and/or 
providing an answer to the narrative paradox ob- 
served in graphically represented virtual worlds. 
Involves participating users in a highly flexible real- 
time environment where authorial activities are 
minimised and the distinction between authoring- 
time and presentation-time is substantially removed. 
Empathic Agent: A synthetic character that
evokes an empathic reaction in the user. 

Empathy: “An observer reacting emotionally 
because he perceived that another is experiencing or 
about to experience an emotion.” 

Synthetic Character: Computer generated semi- 
autonomous agent corporally embodied using multi- 
media and/or robotics. 



Uncanny Valley: Feelings of unease, fear, or 
revulsion created by a robot or robotic device that 
appears to be, but is not quite, human-like. 

Virtual Learning Environment: A set of teaching
and learning tools designed to enhance a student’s
learning experience by including computers and the 
Internet in the learning process. 






Improving Dynamic Decision Making through 
HCI Principles 

Hassan Qudrat-Ullah 

York University, Canada 



INTRODUCTION 

Computer simulation-based interactive learning
environments (CSBILEs) allow the compression of time
and space and provide an opportunity for practicing
managerial decision making in a non-threatening way
(Issacs & Senge, 1994). In a CSBILE, decision
makers can test their assumptions, practice exerting
control over a business situation, and learn from the
immediate feedback of their decisions. A CSBILE’s
effectiveness is associated directly with decision-making
effectiveness; that is, if one CSBILE improves
decision-making effectiveness more than
other CSBILEs, it is more effective than the others.
Despite an increasing interest in CSBILEs, empirical
evidence for their effectiveness is inconclusive
(Bakken, 1993; Diehl & Sterman, 1995; Moxnes,
1998). The aim of this article is to present a case for
HCI design principles as a viable way to
improve the design of CSBILEs and, hence, their
effectiveness in improving decision makers’ performance
in dynamic tasks. This article is organized as
follows: some background concepts are presented
first; next, we present an assessment of the prior
research on (i) dynamic decision making (DDM) and
CSBILEs and (ii) HCI and DDM; the section on
future trends presents some suggestions for future
research; and a final section offers some conclusions.



BACKGROUND 

Dynamic Decision Making 

What is dynamic decision making (DDM)? Dynamic 
decision-making situations differ from those tradi- 
tionally studied in static decision theory in at least 
three ways: 



• A number of decisions are required rather than 
a single decision. 

• Decisions are interdependent. 

• The environment changes either as a result of
decisions made, independently of them, or both
(Edwards, 1962).

Recent research in system dynamics has charac- 
terized such decision tasks by multiple feedback 
processes, time delays, non-linearities in the rela- 
tionships between decision task variables, and un- 
certainty (Bakken, 1993; Hsiao, 2000; Sengupta & 
Abdel-Hamid, 1993; Sterman, 1994). 

We confront dynamic decision tasks quite rou- 
tinely in our daily life. For example, driving a car, 
flying an airplane, managing a firm, and controlling 
money supply are all dynamic tasks (Diehl & Sterman, 
1995). These dynamic tasks are different from static 
tasks such as gambling, locating a park on a city map, 
and counting money. In dynamic tasks, in contrast to 
static tasks, multiple and interactive decisions are 
made over several time periods whereby these 
decisions change the environment, giving rise to new 
information and leading to new decisions (Brehmer, 
1990; Forrester, 1961; Sterman, 1989a, 1994). 
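
The feedback and time-delay characteristics described above can be made concrete with a toy stock-management task. The sketch below is an illustrative assumption, not a model from the cited studies: a decision maker orders goods to reach a target inventory, and each order arrives after a fixed delay. The names `simulate` and `account_for_pipeline` are hypothetical.

```python
from collections import deque


def simulate(target=100.0, delay=3, steps=30, account_for_pipeline=False):
    """Return the inventory level at each step of a toy dynamic task."""
    inventory = 50.0
    pipeline = deque([0.0] * delay)  # orders placed but not yet delivered
    history = []
    for _ in range(steps):
        inventory += pipeline.popleft()  # oldest order arrives
        gap = target - inventory
        if account_for_pipeline:
            gap -= sum(pipeline)  # subtract goods already on their way
        order = max(0.0, gap)  # the decision for this period
        pipeline.append(order)  # the decision changes future states
        history.append(inventory)
    return history


naive = simulate(account_for_pipeline=False)
informed = simulate(account_for_pipeline=True)
print(max(naive), max(informed))
```

In this toy setting, a decision rule that ignores orders already in transit overshoots the target, while one that accounts for them does not; each decision alters the environment that later decisions must respond to.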

CSBILEs 

We use CSBILE as a term sufficiently general to 
include microworlds, management flight simulators, 
learning laboratories, and any other computer simu- 
lation-based environments. The domain of these 
terms is all forms of action whose general goal is the 
facilitation of decision making and learning in dy- 
namic tasks. This conception of CSBILE embodies 
learning as the main purpose of a CSBILE (Davidsen, 
2000; Lane, 1995; Moxnes, 1998; Sterman, 1994). 
Computer-simulation models, human intervention, 
and decision making are considered the essential
components of a CSBILE (Bakken, 1993; Cox, 
1992; Davidsen, 1996; Davidsen & Spector, 1997; 
Lane, 1995; Sterman, 1994). 

Under this definition of CSBILE, learning goals
are made explicit to decision makers. A computer-simulation
model is built to represent adequately the
domain or issue under study, with which decision
makers can induce and experience real-world-like
responses (Lane, 1995). Human intervention refers
to active keying in of the decisions by decision 
makers into the computer-simulation model via a 
decision-making environment or interface. Human 
intervention also arises when a decision maker 
interacts with a fellow decision maker during a group 
setting session of a CSBILE or when a facilitator 
intervenes either to interact with the simulated sys- 
tem or to facilitate the decision makers. 



DDM AND CSBILEs 

Business forces, such as intensifying competition,
changing operating environments, and rapidly
advancing technology, have made organizational
decision making a complex task (Diehl & Sterman,
1995; Moxnes, 1998; Sterman, 1989b), and all challenge
traditional management practices and beliefs.
The development of managerial skills to cope with
dynamic decision tasks is in ever-higher demand.
However, the acquisition of managerial decision-making
capability in dynamic tasks faces many barriers
(Bakken, 1993). On the one hand, the complexity
of corporate and economic systems does not lend
itself well to real-world experimentation. On the
other hand, most real-world decisions and
their outcomes are not closely related in either time or
space, which compounds the problem of decision
making and learning in dynamic tasks.

However, computer technology, together with
the advent of new simulation tools, provides a potential
solution to this managerial need. For instance,
CSBILEs are often used as decision support systems
in order to improve decision making in dynamic
tasks by facilitating user learning (Davidsen &
Spector, 1997; Lane, 1995). CSBILEs allow the
compression of time and space, providing an opportunity
for managerial decision making in a non-threatening
way (Issacs & Senge, 1994).

In the context of CSBILEs, how well do people
perform in dynamic tasks? The literature on DDM
(Funke, 1995; Hsiao, 2000; Kerstholt & Raaijmakers,
1997; Qudrat-Ullah, 2002; Sterman, 1989a, 1989b)
and learning in CSBILEs (Bakken, 1993; Keys &
Wolf, 1990; Lane, 1995; Langley & Morecroft)
provides an almost categorical answer: very poorly.
Very often, poor performance in dynamic tasks is
attributed to subjects’ misperceptions of feedback
(Diehl & Sterman, 1995; Moxnes, 1998; Sterman,
1989b). The misperception of feedback (MOF) perspective
concludes that subjects perform poorly
because they ignore time delays and are insensitive
to the feedback structure of the task system. The
paramount question becomes the following: Are
people inherently incapable of managing dynamic
tasks? Contrary to Sterman’s (1989a, 1989b) MOF
hypothesis, an objective scan of real-world decisions
would suggest that experts can deal efficiently with
highly complex dynamic systems in real life; for
example, maneuvering a ship through restricted
waterways (Kerstholt & Raaijmakers, 1997). The
expertise of river pilots seems to consist more of
using specific knowledge (e.g., pile moorings, buoys,
leading lines) that they have acquired over time than
of being able to predict accurately a ship’s movements
(Schraagen, 1994). This example suggests
that people are not inherently incapable of better
performance in dynamic tasks but that decision
makers need to acquire the requisite expertise.
Thus, in the context of CSBILEs, treating learning
as a progression toward prototypic expertise
(Sternberg, 1995) is a very appropriate measure.
The most fundamental research question for
DDM research then seems to be how to acquire prototypic
expertise in dynamic tasks. An answer to this question
would effectively provide a competing hypothesis
to the MOF hypothesis: people will perform better
in dynamic tasks if they acquire the requisite expertise.
We term this competing hypothesis the
acquisition-of-expertise (AOE) hypothesis. The following
section explains how human-computer
interface (HCI) design may help decision makers acquire
prototypic expertise in dynamic tasks.

HCI AND DDM

The successes of HCI design in disciplines such
as management information systems, information science,
and psychology, all sharing the common goal of
improving organizational decision making, are
considerable (Carey et al., 2004). However, despite
the fact that all CSBILEs must have an HCI element,
the role of HCI design in improving DDM has received
little attention from DDM researchers. Only
recently have Howie et al. (2000) and Qudrat-Ullah (2002)
directed DDM research to this dimension.
Howie et al. (2000) investigated empirically the
impact on DDM of an HCI design based on human factors
guidelines. The results revealed that the
new interface design, based on the following
human-computer interaction principles, led to improved
performance in the dynamic task compared to the
original interface:

• Taking advantage of people’s natural tenden- 
cies (e.g., reading from left to right and from top 
to bottom); 

• Taking advantage of people’s prior knowledge
(e.g., through the use of metaphors);

• Presenting information in a graphical manner to 
tap into people’s pattern-recognition capabili- 
ties; and 

• Making the relationship among data more sa- 
lient so that people can develop a better mental 
model of the simulation. 

In his empirical study, Qudrat-Ullah (2001) stud- 
ied the effects of an HCI design based on learning 
principles (Gagne, 1995) on DDM. The results showed 
that the CSBILE with HCI design based on the 
following learning principles was effective on all four 
performance criteria: 

• Gaining attention (e.g., the decision makers, at 
the very first screen of the CSBILE, are pre- 
sented with a challenging task with the help of 
a text window and background pictures of rel- 
evant screens to grab their attention and arouse 
interest and curiosity); 

• Informing of the objectives of the task (e.g., the 
objective is presented in clear terms: How does 
the tragedy of the commons occur?); 



• Stimulating the recall of prior knowledge (e.g.,
the pre-play test helps stimulate recall of prior
knowledge);

• Presenting the content systematically (e.g.,
text and objects are used in the CSBILE for
material presentation);

• Providing learning guidance (e.g., the decision 
makers are led to an explanation interface as 
a guidance for learning); 

• Eliciting performance (e.g., the navigational 
buttons of the CSBILE allow the decision 
maker to go back and forth from generic to 
specific explanation and vice versa, facilitat- 
ing the performance elicitation); 

• Providing feedback (e.g., the pop-up window 
messages provide feedback to the decision 
makers as such); 

• Assessing performance (e.g., the post-test is 
designed to assess the performance of the 
decision maker); 

• Enhancing retention and transfer (e.g., the
debriefing session of the CSBILE augments
the last instructional event, enhancing retention
and transfer of knowledge).

On the four performance criteria, the new design
improved task performance, helped users learn more
about the decision domain, helped them develop
heuristics, and required them to expend less cognitive
effort, which supports the AOE hypothesis.

FUTURE TRENDS 

Although any generalization based on just two studies
may not be realistic, there appears to be a clear
call to reassess the earlier studies on DDM supporting
the MOF hypothesis. By employing HCI design
principles in CSBILEs, future studies should explore
the following:

• Cost Economics: To what extent do the HCI 
design-based CSBILEs help dynamic decision 
makers to cope with limited information-pro- 
cessing capacities? 

• Reducing Misperception of Feedback: Increasing
task salience and task transparency
in dynamic tasks results in improved performance
(Issacs & Senge, 1994). Are HCI design-based
CSBILEs effective in reducing
misperceptions of feedback?

• Supporting Learning Strategies: Success- 
ful decision makers in dynamic tasks develop 
perceptually oriented heuristics (Kirlik, 1995). 
To what extent do the HCI design-based 
CSBILEs help dynamic decision makers to 
develop perceptually oriented decision heuris- 
tics? 



CONCLUSION 

DDM research is highly relevant to managerial
practice (Diehl & Sterman, 1995; Kerstholt &
Raaijmakers, 1997; Kleinmuntz, 1985). We need
better tools and processes to help managers cope
with the ever-present dynamic tasks. This article
makes a case for the inclusion of HCI design in any 
CSBILE model aimed at improving decision making 
in dynamic tasks. We believe that the lack of empha- 
sis on HCI design in a CSBILE resulted, at least in 
part, in poor performance in dynamic tasks by people. 
Moreover, HCI design methods and techniques can 
be used to reduce the difficulties people have in 
dealing with dynamic tasks. At the same time, we 
have made the case to reassess the earlier studies on 
dynamic decision making supporting the MOF hy- 
pothesis. Perhaps by focusing more attention on 
improved interface design for CSBILEs, we can 
help people make better organizational decisions. 

KEY TERMS

Acquisition-of-Expertise Hypothesis: States 
that people will perform better in dynamic tasks, if 
they acquire the requisite expertise. 

Feedback: A process whereby an output
variable feeds back to affect an input variable. For
example, an increased (or decreased) customer
base leads to an increase (or decrease) in sales from
word of mouth, which then is fed back to the
customer base, increasing or decreasing it.

Mental Model: A mental model is the collection 
of concepts and relationships about the image of the 
real-world things we carry in our heads. For ex- 
ample, one does not have a house, city, or gadget in 
his or her head, but a mental model about these 
items. 

Non-Linearity: A non-linearity exists between 
a cause (decision) and effect (consequence), if the 
effect is not proportional to cause. 

Prototypic Expertise: The concept of prototypic 
expertise views people neither as perfect experts 
nor as non-experts, but somewhere in between both 
extremes. 

Requisite Expertise: Having an adequate un-
derstanding of the task that helps to manage the task 
successfully. 

Simulated System: A simplified, computer simu- 
lation-based construction (model) of some real- 
world phenomenon (or the problem task). 



Time Delays: Often, the decisions and their 
consequences are not closely related in time. For 
instance, the response of gasoline sales to the changes 
in price involves time delays. If prices go up, then 
after a while, sales may drop. 

INTRODUCTION

For people with motor impairments, access to, and 
independent control of, a computer can be an impor- 
tant part of everyday life. However, in order to be of 
benefit, computer systems must be accessible. 

Computer use often involves interaction with a 
graphical user interface (GUI), typically using a 
keyboard, mouse, and monitor. However, people 
with motor impairments often have difficulty with 
accurate control of standard input devices (Trewin 
& Pain, 1999). Conditions such as cerebral palsy, 
muscular dystrophy, and spinal injuries can give rise 
to symptoms such as tremor, spasm, restricted range 
of motion, and reduced strength. These symptoms 
may necessitate the use of specialized assistive 
technologies such as eye-gaze pointing or switch 
input (Alliance for Technology Access, 2000). At 
the same time, specialized technologies such as 
these can be expensive and many people simply 
prefer to use standard input devices (Edwards, 
1995; Vanderheiden, 1985). Those who continue to 
use standard devices may expend considerable time 
and effort performing basic actions. 

The key to developing truly effective means of 
computer access lies in a user-centered approach 
(Stephanidis, 2001). This article discusses methods 
appropriate for working with people with motor 
impairments to obtain information about their wants 
and needs, and making that data available to inter- 
face designers in usable formats. 



BACKGROUND 

In a recent research study commissioned by 
Microsoft, Forrester Research, Inc. (2003) found 
that 25% of all working-age adults in the United 
States had some form of dexterity difficulty or 
impairment and were likely to benefit from acces- 
sible technology. This equates to 43.7 million people 
in the United States, of whom 31.7 million have mild 
dexterity impairments and 12 million have moderate 
to severe impairments. 

If retirees had been included in the data sample, 
the number of people who would benefit from acces- 
sible technology would be even higher as the preva- 
lence of motor impairments, and thus the need for 
such assistance, increases noticeably with age 
(Keates & Clarkson, 2003). As the baby-boomer 
generation ages, the proportion of older adults is set 
to increase further. 

The global aging population is growing inexorably 
(Laslett, 1996). By 2020, almost half the adult population
in the United Kingdom will be over 50, with the
over-80s being the most rapidly growing sector
(Coleman, 1993). Governments are responding to 
this demographic change. Antidiscrimination legis- 
lation has been enacted in many countries such as 
the United States with the 1990 Americans with 
Disabilities Act, and the United Kingdom with the 
1995 Disability Discrimination Act. 

These pieces of legislation often allow users who 
are denied access to a service to litigate against the 
service provider. They are mechanisms for enforcing
basic rights of access. A complementary “carrot”
approach to this legislative “stick” is the change
in governmental purchasing policy. In the United 
States, the Section 508 amendment to the 1998 
Workforce Investment Act stipulates minimum lev- 
els of accessibility required for all computer systems 
purchased by the U.S. Federal Government, the 
world’s largest purchaser of information-technology
equipment. Many other national and regional gov- 
ernments are adopting similar purchasing policies. 

Research Methods for 
Improving Accessibility 

To provide truly accessible systems, it is necessary 
to examine the user experience as a whole and to 
adopt design best practices wherever possible. To 
this end, standards are being developed, such as the 
forthcoming British Standards Institute BS7000 Part 
6, “Guide to Managing Inclusive Design,” that focus 
on wider interpretations of accessibility throughout 
the complete lifetime of products. 

In addition, heuristic evaluations of prototypes 
can reveal fundamental physical-access issues. 
Accessibility standards like the U.S. Section 508 
guidelines (http://www.section508.gov/), or the W3C 
Web Accessibility Initiative Web Content Accessi- 
bility guidelines and checklists (Chisholm, 
Vanderheiden, & Jacobs, 1999) are readily available 
to assist with establishing the heuristics. Examples 
include testing whether keyboard-only access is 
possible and examining the size of targets the user is 
expected to click on. Addressing these issues in 
advance of user testing will allow the maximum 
benefit to be gained from the user sessions them- 
selves. 
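
Two of the heuristic checks mentioned above, keyboard-only access and target size, lend themselves to a simple automated pass before user sessions. The sketch below is illustrative only: the `Control` record, the `audit` function, and the 24-pixel threshold are assumptions made for this example and are not taken from the Section 508 or W3C guidelines.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """A clickable interface element (hypothetical record for this sketch)."""
    name: str
    width_px: int
    height_px: int
    keyboard_reachable: bool


def audit(controls, min_size_px):
    """Return (control name, issue) pairs for two physical-access heuristics."""
    issues = []
    for c in controls:
        if not c.keyboard_reachable:
            issues.append((c.name, "not reachable by keyboard"))
        if min(c.width_px, c.height_px) < min_size_px:
            issues.append((c.name, "target smaller than threshold"))
    return issues


# Hypothetical interface description and threshold chosen by the evaluator.
controls = [
    Control("Save", 80, 32, True),
    Control("Close", 12, 12, True),
    Control("Color picker", 40, 40, False),
]
issues = audit(controls, min_size_px=24)
for name, issue in issues:
    print(f"{name}: {issue}")
```

Issues found this way can be fixed in advance, so that sessions with users concentrate on problems that a checklist cannot reveal.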

Ideally, users with disabilities should be included 
in product design and usability testing early and 
often. Many user-interface designers are not ad- 
equately equipped to put this into practice (Dong, 
Cardoso, Cassim, Keates, & Clarkson, 2002). Most 
designers are unfamiliar with the needs of people 
with motor impairments and are unsure how to 
contact such users or include them in studies. The 
following sections outline some specific consider- 
ations and techniques for including this population in 
user studies. 



SAMPLING USERS 

For traditional user studies, the users would typically 
be customers or employees and would often be 
readily at hand. However, when considering users 
with a wide range of capabilities, it is often neces- 
sary to commit explicit effort and resource to seek- 
ing out potential participants. 

Good sources of users include charitable organi- 
zations, social clubs, and support groups, which can 
be found in most towns and cities. However, even 
when sources of users have been identified, effort 
still needs to be expended in trying to identify candi- 
date users who match the user-sampling profiles. 
Sample sizes are inevitably small since volunteers 
must be reasonably typical users of the product in 
addition to having a physical impairment. 

Sampling Users by Condition 

There are many possible approaches for identifying 
and sampling potential users. The most obvious is to 
identify users based on their medical condition. The 
advantage of this approach is that someone’s medi- 
cal condition is a convenient label for identifying 
potential users. Not only are most users aware of 
any serious condition, especially one that affects 
their motor capabilities, but it also makes locating 
users easier. For example, many charitable organi- 
zations are centered on specific medical conditions, 
such as cerebral palsy, muscular dystrophy, or 
Parkinson’s disease. 

The disadvantage of this approach is that many 
of these conditions are highly variable in terms of 
their impact on the user’s functional capabilities, and
so a degree of user-capability profiling is still re- 
quired. 

Sampling Users by Capability 

The alternative approach to sampling users is not to 
focus on their medical condition, but to instead look 
at their capabilities. The advantage of this approach 
is that the accessibility of the resultant product 
should then be independent of the medical condition. 
The disadvantage of this approach is that more user- 
capability profiling is required at the outset to estab- 
lish where each user sits in the capability continuum. 

The most popular approaches to sampling issues
are to either find users that represent a spread across 
the target population, or to find users that sit at the 
extremes of that population. The advantage of work- 
ing with users that represent a spread across the 
population is that they ensure that the assessment 
takes the broadest range of needs into account. The 
disadvantage, though, is that there is not much depth 
of coverage of users who may experience difficulties 
in accessing the product. 

The advantage of working with the extreme users 
is that the user-observation sessions will almost 
certainly discover difficulties and problems with the 
interaction. However, the disadvantage is that there 
is a real danger of discovering that particular users 
cannot use the product and little else beyond that. For 
example, giving a drawing program with an on- 
screen toolbox to a user who cannot use a mouse 
yields the obvious difficulty arising from the inability 
to choose a drawing tool. However, subsequent 
questions about the tools themselves are not possible 
because of the overriding difficulty of choosing them. 

Of more use is to identify users who are more 
likely to be “edge” cases: those who are on the 
borderline of being able to use the product, and who 
would commonly be accepted as being able to use the 
interface (Cooper, 1999). Going back to the example 
of someone with a motor impairment attempting to 
use a drawing program, while someone unable to use 
a mouse at all would certainly not be able to use the 
drawing tools, someone with a moderate tremor may 
be able to do so. Even more interestingly, that person 
might be able to access some tools and not others, and 
thus it is possible to begin to infer a wide range of very 
useful data from such a user. On top of that, if the 
user cannot use the tools, then it may be inferred that 
any user with that level of motor impairment or worse 
will not be able to use them, automatically encom- 
passing the users who cannot control a mouse in the 
assessment. 



WORKING WITH USERS 

As with all user studies, the participating users need 
to be treated with respect and courtesy at all times. 
When dealing with users with more severe impairments, 
it is especially important to be sensitive to their 
needs, and accommodations in study design may be 
necessary. 

Location 

Many usability tests are carried out in a laboratory. 
For people with physical impairments, this is not 
always ideal. Individuals may have made many 
modifications to their home or work environments 
to allow them to work comfortably and accurately, 
and this will often be difficult to reproduce in a lab 
session. The user may not be able or willing to bring 
assistive technologies they use at home. Further- 
more, the user’s impairment may make travel to 
sessions difficult and/or physically draining. When 
laboratory sessions are carried out, researchers 
should consider providing the following facilities, 
and plan to spend time at the start of each session 
making sure that the user is comfortable. 

• Table whose height can be easily adjusted 

• Moveable and adjustable chair 

• Cordless keyboard and mouse that can be 
placed on a user’s wheelchair tray 

• Keyboard whose slope and orientation can be 
adjusted and then fixed in place on the table 

• Alternative pointing devices such as a trackball 

• Adjustments to the key-repeat delay, key- 
repeat rate, mouse gain, double-click speed, 
and any other software accessibility features 
to users’ preferred settings 

A compromise approach that can work well is to 
hold sessions at a center specializing in computer 
access for people with disabilities, where such 
equipment is already available. Users can also be 
encouraged to bring their own devices when prac- 
tical (e.g., special keyboard and key guard). 

If tests can be carried out at the user’s own 
location, then a more realistic usability evaluation 
can be performed. For users who employ special- 
ized assistive technologies such as head pointing or 
eye gaze, it may be useful to schedule time for the 
user to explain these technologies to the research- 
ers as it may be difficult to understand what is 
happening if the operation of this device is unfamiliar. 

Remote testing is an increasingly popular tech- 
nique in which the user carries out a task from his 
or her own environment while the researcher is not 
physically present. This approach is sometimes nec- 
essary when local users cannot be found. Its effi- 
cacy depends on the kind of evaluation to be per- 
formed. Telephone, e-mail, or chat can be used as a 
means of communication, or the task can be entirely 
self-driven through a Web site or other software. 
Gathering information from users by e-mail has the 
advantage that users can take as long as they need 
to prepare responses, and can be useful for those 
whose speech is difficult to understand. The remote 
evaluation of products or product prototypes is also 
possible, but the quality of the information received 
will often be poorer. For example, if a user takes a 
long time to perform a task, the researcher does not 
know whether this is because they had trouble in 
deciding what to click on, trouble in clicking on the 
icon, or because they were interrupted by a family 
member. Detailed recordings of the user’s input 
activities or the use of a camera can be helpful in this 
respect. 

Methods for Gathering User Data 

The following list represents a summary of typical 
methods used by researchers to elicit user wants and 
investigate product usability. 

• Questionnaires: A series of preprepared 
questions asked either in writing or orally 

• Interviews: Either prestructured or free-form 

• User Observation: Watching the users per- 
form a task, either using an existing product or 
a prototype 

• Focus Groups: Discussion groups addressing 
a specified topic 

• Contextual Inquiry: Interviewing and ob- 
serving users in situ 

All of the above methods are discussed in detail 
in many HCI (human-computer interaction) design 
textbooks (e.g., Beyer & Holtzblatt, 1998; Nielsen 
& Mack, 1994). As with any technique or approach 
originally developed for the mainstream market, 
additional considerations need to be borne in mind 
when adapting these methods to design for the 
whole population. 

When including people with motor impairments, it 
is useful to plan for the following: 



• Users may fatigue more quickly than the re- 
searcher expects. The user’s fatigue level 
should be carefully monitored and the researcher 
should be prepared to give frequent breaks, end 
a session early, and split tasks over multiple 
sessions if necessary. 

• Extra time should be allowed for computer- 
based activities. Allow 2 to 3 times as long for 
someone with a moderate impairment and be 
prepared for some individuals to spend longer. 

• For those whose disability has caused a speech 
impairment in addition to motor impairment 
(e.g., cerebral palsy), additional time for com- 
munication will be necessary, and the researcher 
may need to ask the user to repeat statements 
multiple times. Users are generally happy to do 
this in order to be understood. Researchers 
should also repeat responses back to the user to 
check that they have understood. In some 
cases, the user may choose to type responses 
into a document open on the computer. 

• Some users may have difficulty signing a consent 
form. Some may sign an X or use a stamp 
to sign, while others may wish to sign electronically. 
Be prepared for all of these. 

• Users may prefer to respond to questionnaires 
verbally or electronically rather than use printed 
paper and pen. 

• Some physical disabilities have highly variable 
symptoms or may cause additional health prob- 
lems. Experimenters should expect higher-than- 
normal dropout rates, and be careful to confirm 
sessions near the time in case the participant is 
unable to attend. 
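
The timing guidance in the list above can be turned into a rough session plan. The sketch below is only an illustration: the function name `plan_sessions` and the 20-minute block length are assumptions introduced here, while the 2 to 3 times multiplier comes from the text.

```python
from math import ceil


def plan_sessions(baseline_min, multiplier_range=(2.0, 3.0), max_block_min=20):
    """Estimate the time budget for a participant with a moderate motor impairment.

    baseline_min: time the task takes in a mainstream study (minutes).
    multiplier_range: the 2x to 3x allowance suggested in the text.
    max_block_min: hypothetical longest working block before a break.
    """
    low, high = (baseline_min * m for m in multiplier_range)
    # Plan for the upper estimate so that breaks are never squeezed out.
    blocks = ceil(high / max_block_min)
    return {
        "expected_min": (low, high),
        "blocks": blocks,
        "breaks": max(blocks - 1, 0),
    }


if __name__ == "__main__":
    print(plan_sessions(15))
```

Because some individuals will need longer than the upper estimate, such a plan is best treated as a starting point, with the option of splitting blocks over several sessions.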



PACKAGING THE USER DATA 

Having discussed the issues that HCI researchers 
and practitioners have to consider when aiming to 
design for universal access, it is helpful to look at 
ways of packaging the user data in a succinct 
format. 

Presenting User Profiles 

There are a number of methods of packaging the 
user information for designers. For example, short 
videos of target users, perhaps depicting their 
lifestyles or showing them using or talking about 
particular products, provide designers with greater insights into 
the needs and aspirations of users. Such dynamic 
illustrations can be effective in inspiring designers to 
formulate inclusive solutions. They can also be very 
informative to designers who have never seen 
assistive technologies in use before. 

Such accounts offer immediate means of assess- 
ing a variety of ways and situations in which a 
product or service will be used or accessed. This can be 
a powerful technique if care is taken when building 
up user profiles based on actual user data or amal- 
gams of individual users constructed to represent the 
full range of target users and contexts of use. 

The Application of Statistical Analyses 

Statistical analyses are often useful ways to summa- 
rize and present quantitative data, but there are 
practical limitations to these techniques when in- 
cluding data from people with motor impairments. 
This is due to the variable availability of individual 
users, the small sample set, and considerable indi- 
vidual differences. It may be necessary to consider 
data gathered from individuals with motor impair- 
ments separately from other users. Because of the 
small number of users available, repeated measures 
designs should generally be employed. Obviously, 
these practical difficulties give rise to missing data 
problems resulting from incomplete conditions, caused 
by the loss of levels and factors from designs, and 
make the systematic varying of conditions in pilot 
studies difficult. 

In addition, the increased range and skewed 
variability resulting from the range of motor impair- 
ments leads to increased noise and violation of the 
assumptions of statistical tests. Where statistical 
tests are possible without violation of standard 
assumptions, such as normality of distribution or 
homogeneity of variance, they should be carried out. 
However, even if the power of these experiments 
was unknown because of the reasons outlined and 
the small sample size, the effect sizes may still be 
large because of the sometimes radically different 
behaviours that are associated with different func- 
tional impairments. For this reason, some statistical 
results that do not appear significant should be 
analysed in terms of statistical power (1 − β, the 
probability of rejecting a false null hypothesis; Cohen, 
1988) and estimates of the effect size given (Chin, 
2000). 
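
As a rough illustration of reporting effect size and power alongside a nonsignificant result, the Python sketch below computes Cohen's d for two small groups and a normal-approximation estimate of power. The task times, the function names, and the use of a normal approximation in place of the noncentral t distribution are assumptions made for illustration; they are not taken from Cohen (1988) or Chin (2000).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def cohens_d(a, b):
    """Standardised mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample test.

    Uses a normal approximation, which is optimistic for very small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical task-completion times in seconds for two small groups.
    tremor_group = [41.0, 55.0, 62.0, 48.0, 70.0]
    comparison_group = [30.0, 33.0, 41.0, 36.0, 45.0]
    d = cohens_d(tremor_group, comparison_group)
    print(f"Cohen's d = {d:.2f}")
    print(f"approximate power with n = 5 per group: {approx_power(d, 5):.2f}")
```

Reporting both numbers lets a reader distinguish "no effect" from "an effect that the study was too small to detect".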

User Models 

Another method for packaging quantitative user 
information is a user model. In this context, a user 
model is a quantitative description of a user’s inter- 
action behaviour that can be used to describe, pre- 
dict, and/or simulate user performance on specific 
tasks. It has been used to model single-switch letter 
scanning and predict communication rates for scan- 
ning, and alternative and augmentative communica- 
tion (AAC) devices (Horstmann, 1990; Horstmann 
& Levine, 1991). There are critics of the applicabil- 
ity of such models to motion-impaired users (Newell, 
Arnott, & Waller, 1992; Stephanidis, 1999) who 
object to the use of generalizations for a population 
with such great individual differences. However, 
models representing a specific individual, or a group 
of relatively similar individuals, can help designers to 
understand the effects of their design decisions and 
refine their designs for improved usability. 
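
To make the idea of a quantitative user model concrete, the following Python sketch predicts text-entry time for single-switch row-column scanning. It is a deliberately simplified toy model, not the model of Horstmann and Levine: the letter layout, the assumption that every scan step costs one fixed period, and all function names are hypothetical.

```python
# Hypothetical scanning layout, roughly ordered by English letter frequency.
LAYOUT = ["etaoin", "shrdlu", "cmfwyp", "vbgkqj", "xz_"]


def selection_steps(layout, ch):
    """Scan steps needed to select ch: rows are scanned first, then columns."""
    for r, row in enumerate(layout):
        c = row.find(ch)
        if c != -1:
            return (r + 1) + (c + 1)
    raise ValueError(f"{ch!r} is not in the layout")


def predicted_seconds(layout, text, scan_period_s):
    """Predicted time to enter text for a user with the given scan period."""
    return sum(selection_steps(layout, ch) for ch in text) * scan_period_s


def characters_per_minute(layout, text, scan_period_s):
    """Predicted communication rate for this user and layout."""
    return 60.0 * len(text) / predicted_seconds(layout, text, scan_period_s)


if __name__ == "__main__":
    # The scan period is a per-user parameter measured for one individual.
    print(characters_per_minute(LAYOUT, "the_cat", scan_period_s=1.2))
```

Because the scan period is fitted to one person, a model of this kind represents a specific individual rather than the population, which is the usage the paragraph above recommends.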

Claims Approach to Requirements 

Where quantitative data is not available, another 
method of packaging the user information is that of 
claims (Sutcliffe & Carroll, 1999). For example, if an 
on-screen button is hard to press, then a claim could 
be made that increasing the size of the button would 
make it easier to operate. The claim also identifies 
the user and situation for which it applies, recogniz- 
ing that there are often conflicting requirements that 
can lead to design compromises being sought. 

FUTURE TRENDS 

With antidiscrimination legislation being enacted by 
an increasing number of countries, designers are 
going to come under increasing pressure to ensure 
that all user interfaces, both hardware and software, 
are as accessible as possible. This means that in the 
future, designers will have to work more closely with 
users with all kinds of impairments, from vision and 
hearing to motor and cognitive. 

One of the most time-consuming aspects of 
working with motor-impaired users is finding and 
recruiting them for the inevitable user trials required 
to ensure that the systems being developed meet 
their needs. One option that design teams may well 
begin to pursue is that of incorporating people with 
different impairment types into the design team 
itself. This approach offers the immediate advan- 
tage that detailed feedback on the effect of design 
choices on a system’s accessibility can be deter- 
mined rapidly. A potential further step is to train 
those people to actively drive the design process, 
taking their needs into consideration from the very 
outset. With many companies having to meet em- 
ployee quota targets under disability legislation, this 
approach should become an increasingly attractive 
proposition to many design teams. 

CONCLUSION 

Including people with motor impairments in the 
design and evaluation of computer products is es- 
sential if those products are to be usable by this 
population. There are some special considerations 
and techniques for including this population in user 
studies. Researchers may need to expend some 
effort locating appropriate users. It is good practice 
to perform capability assessment to identify edge- 
case individuals who should, in principle, be able to 
use the product, but may be excluded by specific 
design features. Study materials and methodologies 
may need to be modified to meet the needs of users. 
Laboratories, user environments, and remote testing 
can all be used, although testing in the user’s envi- 
ronment is preferred whenever possible. The statis- 
tical analysis of user data is possible in specific 
circumstances, and significant effects can be found 
even with small sample sizes, but care must be taken 
to use tests that do not rely on inappropriate assump- 
tions. User data can be presented to designers 
quantitatively, as statistical summaries or user mod- 
els, or qualitatively, as user profiles or claims. 
KEY TERMS 

Accessibility: A characteristic of information 
technology that allows it to be used by people with 
different abilities. In more general terms, accessibility 
refers to the ability of people with disabilities to 
access public and private spaces. 

Accessible Technology: Products, devices, or 
equipment that can be used, with or without assistive 
technology, by individuals with disabilities. 

Assistive Technology: Products, devices, or 
equipment, whether acquired commercially, modi- 
fied, or customized, that are used to maintain, in- 
crease, or improve the functional capabilities of 
individuals with disabilities. 

Inclusive Design: The design of mainstream 
products and/or services that are accessible to, and 
usable by, as many people as reasonably possible on 
a global basis, in a wide variety of situations, and to 
the greatest extent possible without the need for 
special adaptation or specialized design. 

Motor Impairment: A problem in body motor 
function or structure such as significant deviation or 
loss. 

User-Centered Design: A method for design- 
ing ease of use into a product by involving end users 
at every stage of design and development. 

User Model: A quantitative description of a 
user’s interaction behaviour that can be used to 
describe, predict, and/or simulate user performance 
on specific tasks. 
INTRODUCTION 

The soul is divided into an immortal part, located 
in the head, and a mortal part, distributed over 
the body. Philosophical and intellectual loves of 
beauty are located in the immortal soul. Other 
“regular” emotions are located in the mortal 
soul. (Plato as cited in Koolhaas, 2001) 

Emotion is one of the lovely gifts from nature. It 
is not unique to humans; most species display some 
form of emotion and expression in their daily 
behaviors. However, only human beings ask for 
explanations. Research into the mystery of emotion 
can be traced back to Heraclitus (500 BC), who 
claimed that “the emotional state is characterized by 
a mixture of body parameters such as temperature 
(hot/cold) and sweat amount (wet/dry)” (as cited in 
Koolhaas, 2001). 

In the 21st century, technology has achieved a 
standard that Plato never dreamed about, but emo- 
tion is still an unsolved question. Although science 
needs more time to work out the mechanism, it does 
not keep emotion out of human communication. 

With the commercial success of the Internet, 
more people spend their time with their box: the 
computer. Designing an attractive user interface is 
not only the objective of every software developer 
but also is crucial to the success of the product. 
Methods and guidelines (Newman & Lamming, 
1995) have been published to design a “vivid” user 
interface. One of the most important methods is to 
add expressive images in the display (Marcus, 2003). 
For example, when a user finishes some operation, 
an emotional icon or emoticon (an industry term 



introduced in the 1980s by Meira Blattner) will pop 
up to communicate “well done” to the user. 

Two widely accepted methods exist for display- 
ing emotional feelings in software interfaces. One is 
the use of emotion-oriented icons; the other is using 
complex images, for example, a cartoon or a facial 
image (Boucouvalas, Xu, & John, 2003; Ekman, 
1982). 

Emotion icons cannot communicate complex feel- 
ings, and they are not usually customized. As the 
industry matures, perhaps emoticons will be re- 
placed by expressive images as sophisticated as the 
computer-generated Gollum of The Lord of the 
Rings movie fame. 

Expressive images present emotional feelings to 
users. What internal factors (e.g., image intensity or 
people’s mood) may influence the perceived emo- 
tional feelings? Will external factors (e.g., display 
duration) influence the perceived emotional feelings 
as well? 

In this article, we are particularly interested in 
discussing the factors that may influence the per- 
ceived emotional feelings. Our conclusions are based 
on the findings from a series of experiments that 
demonstrate an empirical link between the level of 
expressive-image intensities and the perceived feel- 
ings. The detected factors include the following: 

• Expression intensity 

• Wear-down effect (display duration effect) 

The test results demonstrate that increasing the 
expressive-image intensity can improve the per- 
ceived emotional feeling. However, when the inten- 
sity is increased to an extreme level, the perceived 
emotional feelings fall. The experiment results also 
indicate that the perceived emotional feelings are not 
affected by the length of time that users are exposed 
to the expressive images. 

BACKGROUND 

Emotion is not a concept that can be easily defined. 
Izard (1993) describes emotion as a set of motiva- 
tional processes that influence cognition and action. 
Other researchers such as Zajonc (1980) argue that 
emotion is a particular feeling, a quality of conscious 
awareness, and a way of responding. 

A widely accepted fact about emotion is that 
emotion can be classified into different categories 
and numerous intensities. One classification method 
divides emotions into elation, desire, hope, sadness, 
anger, frustration, and so forth (Koolhaas, 2001). 

Emotional expressions not only present one’s 
internal feelings, but also influence interpersonal 
feelings. Moffat and Frijda (1994) demonstrated 
that expressions are a means to influence others. 
Fridlund (1997) found that expressions occur most 
often during pivotal points in social interactions: 
during greetings, social crises, or times of appease- 
ment. According to Azar (2000), “Thinking of facial 
expressions as tools for influencing social interac- 
tions provides an opportunity to begin predicting when 
certain facial expressions will occur and will allow 
more precise theories about social interactions” (p. 
45). 

The influences of emotions on the public domain 
have been examined for many years. Emotion is a 
powerful tool for reporters, editors, and politicians. 
The 9/11 New York attack may not have been 
experienced by all personally; however, most of us 
felt the same fear and pain when we saw the scenes. 
Strong links between the emotion of news and the 
importance individuals assign to issues have been 
suggested by a number of theories (Evatt, 1997). 

Is emotion an important tool online as in daily life? 
Recent research argues that there is in fact a high 
degree of socioemotional content observed in com- 
puter-mediated communications (CMC; McCormick 
& McCormick, 1992; Rheingold, 1994), even in 
organizational and task-oriented settings (Lea & 
Spears, 1991). Even first-time users form impressions 
of other communicants’ dispositions and personalities 
based on their communication style (Lea 
& Spears, 1991). 

Multimodal presentations (e.g., animation, voice, 
and movie clips) for Internet communication are 
more popular than ever as processing speed and 
bandwidth continue to increase. These new presentation 
styles make emotional expression easier to 
transmit than before. 

Will users prefer emotional feelings to be pre- 
sented pictorially on the computer interfaces? Will 
the expressive images influence the perceived feel- 
ings? 

We have carried out a series of experiments to 
investigate these questions (Xu & Boucouvalas, 
2002; Xu, John, & Boucouvalas, in press). 

Xu and Boucouvalas (2002) reported an 
effectiveness experiment. In that experiment, par- 
ticipants were asked to view three interfaces (an 
interface with an expressive image, voice, and text; 
an interface with an expressive image and text; and 
an interface with text only). The results show that 
most participants prefer the interface with the ex- 
pressive image, voice, and text. A significant num- 
ber of participants preferred the interface with the 
expressive image, voice, and text much more than 
the text-only interface. This means that with the 
expressive images, the effectiveness of the human- 
computer interface can be considerably improved. 

Xu et al. (in press) presented a perceived-perfor- 
mance experiment, which demonstrated that emo- 
tion can affect the perceived performance of indi- 
viduals. In that experiment, participants were asked 
to answer questions in an online quiz. A computer 
agent presented 10 questions (e.g., “What percent- 
age of people wear contact lenses?” and choices A, 
15%; B, 30%; C, 20%; D, 50%) to the participants. 
When the participants finished answering the ques- 
tions, either the presenting agent himself (self-as- 
sessing) or a new agent checked the participants’ 
answers (other-assessing). No matter what an- 
swers each participant provided, all were told that 
they answered the same 5 out of 10 questions 
correctly. For the other-assessing scenario, the 
assessing agent presented no emotional expressions, 
emotional expressions positively related to participants’ 
answers, or emotional expressions negatively related to 
participants’ answers. The results from the other-assessing 
scenario demonstrated that significant differences exist 
when comparing the positively-related-emotion 
situation with the negatively-related-emotion situation 
and the no-emotion situation. The participants in the 
positively-related-emotion situation believed that they 
achieved much better performances than the partici- 
pants in the other situations. 

FACTORS THAT MAY INFLUENCE 
EMOTION PRESENTATION 

Are the influences found in the former experiments 
permanent or changeable in different situations? Will 
other factors affect the influence of emotion expres- 
sion? Evatt (1997) discovered that the perceived 
salience of a public-policy issue will increase when 
news about the issue is presented in a highly emotion- 
evoking manner and decrease when the news about 
the same issue is presented in a less emotion-evoking 
manner. This demonstrates that when the intensity of 
the emotion-evoking manner increases, the salience 
the readers perceive will increase. 

Hovland, Janis, and Kelley (1953) suggested that 
increasing the intensity of the emotion-evoking con- 
tent might not always elevate salience. At a certain 
intensity level, the effect could start to drop off. 

Hughes (1992) and Kinnick, Krugman, and 
Cameron (1996) demonstrated a wear-down phe- 
nomenon. Participants’ favourable responses will be 
reduced after the emotion-evoking manner is re- 
peated over a long period of time. As before, the 
measurement was based on pure text. However, 
Evatt (1997) demonstrated that the wear-down phe- 
nomenon is not always observed. 

It can be seen that the influences of presenting 
textual information in an emotion-evoking manner 
will be affected by different factors. However, the 
above experiments are based purely on textual messages, 
which means that the emotion-evoking manners 
used were pure text. Will the emotions presented 
by expressive images produce the same phenomena? 



EMOTIONAL-INTERFACE DESIGN 
CONSIDERATIONS 

In response to the previous experiments and background 
knowledge, it is necessary to identify the 
possible factors that may influence the perception of 
emotion over the Internet. In summary, three 
phenomena were observed. 

• When the intensity of the expressive images 
increases, the perceived emotional feelings 
will increase. This phenomenon means that by 
increasing the intensity of the emotion-evoking 
manner (expressive images), the perceived 
feelings will increase even if the accompanying 
text remains the same. 

• When the intensity of the expressive images 
rises beyond a realistic level, the perceived 
feelings will stop increasing and start to de- 
crease. The levels of emotional feelings were 
predicted to fall as the participants were ex- 
posed to an extremely high intensity of the 
expressive images. 

• Among people who view three scenarios, the 
perceived feelings for the third scenario will be 
higher when it is accompanied by medium-intensity 
expressive images after two scenarios without 
expressive images than when all three scenarios 
are accompanied by medium-intensity expressive 
images (wear-down effect). 

The above three phenomena have been ob- 
served by various researchers (see above discus- 
sion); however, some researchers doubt the exist- 
ence of the phenomena, especially the wear-down 
effect. In this article, we developed two experi- 
ments to assess the applicability of the above phe- 
nomena. 



THE INTENSITY EXPERIMENT 

To assess the influences of expressive images with 
different intensities, a human-like agent was devel- 
oped. The agent presented a story on the screen and 
offered facial expressions. To focus on the influ- 
ences of expressive facial images, the story itself 
contained minimal emotional content. 

Sixty students and staff from Bournemouth 
University participated in this online experiment. 
The experiment included two sessions. First, a 
cartoon human-faced agent presented a story to 
each participant. In the second session, participants 
answered a questionnaire about the emotional feelings 
they perceived. 

A between-group experimental design was ap- 
plied for this experiment. In all conditions, the agent 
presented the same stories to every participant. 
However, in the first condition (low-intensity condi- 
tion), the agent presented facial images with low 
expressive intensity to the participants. In the sec- 
ond condition (medium-intensity condition), the agent 
presented facial images with medium expressive 
intensity. In the third condition, the agent presented 
extreme-expressive-intensity facial images to par- 
ticipants (extreme-intensity condition). Typical 
screens of all conditions are shown in Figure 1. 

After viewing the story presentation session, all 
participants in the three groups were directed to the 
same questionnaire. The applied questionnaire was 
based on the Personal Involvement Inventory (PII) 
that was developed by Zaichkowsky (1986). 

INTENSITY TEST RESULTS 

The Shapiro-Wilk normality test (Norusis, 1998) 
was carried out and the result indicated that the 
observations of the emotion-intensity test were nor- 
mally distributed, and therefore t-tests were carried 
out to determine whether the ratings of participants 
who viewed the different conditions were signifi- 
cantly different. 
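
For readers unfamiliar with between-group comparisons of this kind, the sketch below computes a two-sample t statistic in Python. The article does not publish the raw ratings, so the numbers here are invented, and the choice of Welch's unequal-variance form is an assumption rather than a description of the authors' analysis. Obtaining a p-value additionally requires the t distribution, which a statistics package would normally supply.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, degrees of freedom) for two independent groups.

    Welch's form does not assume that the two groups have equal variances.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical questionnaire ratings on a 7-point scale.
    low_intensity = [3, 2, 4, 3, 3, 4, 2, 3]
    medium_intensity = [5, 4, 5, 4, 6, 4, 5, 4]
    t, df = welch_t(low_intensity, medium_intensity)
    print(f"t = {t:.2f}, df = {df:.1f}")
```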

For the low-intensity condition, the mean value of 
the perceived emotional feeling was 3.22. In the 
medium-intensity condition, the mean perceived 
emotional feeling was 4.6. The t-test revealed a 
significant difference between the ratings of the 
low-intensity condition and the medium-intensity 
condition (F = 3.85, p = 0.044). 

We were therefore able to accept the first phe- 
nomenon that states when the intensity of the emo- 
tionally expressive images increases, the perceived 
emotional feelings will increase as well. 

For the extreme-intensity condition, the mean 
perceived emotional feeling was 3.7. The t-test 
showed a marginally significant difference between 
the ratings of the medium-intensity condition and the 
extreme-intensity condition (F = 4.25, p = 0.08). The result suggests that 
the second phenomenon is correct in asserting that 
when the intensity of the emotional-expression im- 
ages rises beyond a realistic level, the perceived 
feelings will stop increasing and may fall. 

THE WEAR-DOWN-FACTOR (THIRD 
PHENOMENON) EXPERIMENT 

Will external factors influence the perceived emo- 
tional feelings? An experiment was carried out to 
test an external factor: wear-down. Wear-down (or 
wear-out in different literature) is described by 
Hughes (1992) as a reduction in the participant’s 
favourable responses after repeated exposure to a 
message. For example, when an individual first 
meets an exciting stimulus, the excited feelings will 
be high. When the stimulus is repeated many times, 
the exciting feelings will not continue to rise; instead, 
the feelings will be stable or even fall if the stimulus 
is endless. The problem with assessing wear-down 
factors is that it is hard to predict the exact time that 
feelings will be stable or will fall. The wear-down 
factor is illustrated visually in Figure 2. 

Figure 1. Typical screens of the three conditions of the experiment 

Figure 2. The wear-down factor 

To assess the wear-down factor, we still relied 
on the agent-presenting environment. However, instead 
of presenting only one story, the agent pre- 
sented three stories, all of which contained minimal 
emotional content in order to keep the focus on the 
expressive images. 

Forty students and staff from Bournemouth Uni- 
versity participated in this online experiment. The 
participants were divided into two groups. The car- 
toon human-faced agent presented three stories to 
each participant and then the participants answered 
a questionnaire about the perceived emotional feel- 
ings. 



A between-group experimental design was ap- 
plied. The stories were arranged in the same subject 
order and all the stories contained minimal emotional 
content themselves. The presentations to the two 
groups differed only in the intensity of the expressive 
facial images. In the first condition, the agent pre- 
sented two stories without facial expressions fol- 
lowed by a story with medium-intensity facial ex- 
pressions. In the second condition, the agent pre- 
sented all three stories with medium-intensity ex- 
pressions. The typical screens of Group 1 and Group 
2 are shown in Figure 3. 

After viewing the story presentation session, all 
participants in both groups were directed to the same 
questionnaire session. The applied questionnaire 
was also based on the Personal Involvement Inven- 
tory developed by Zaichkowsky (1986). 

Although the story sets were the same, the third 
phenomenon predicted that the perceived emotional 
feelings would be higher when participants viewed 
medium expressive images after two sets of neutral 
expressive images. The third phenomenon is only 
concerned with the responses to the third story in 
each set. The design for each set of story 
presentations is shown in Table 1. 



Figure 3. Typical screens of both conditions of the experiment 







The Influence of Expressive Images for Computer Interaction 



Table 1. Description of research protocol to test third phenomenon 

Participants    Expressive-Image Level 
                Story 1    Story 2    Story 3 
Group 1         None       None       Medium 
Group 2         Medium     Medium     Medium 



Table 2. Results of tests of phenomena 

Phenomenon    Description of Test                                   Supported 
1             Comparison of medium and low expressive images        Yes 
2             Comparison of extreme and medium expressive images    Yes 
3             Testing the wear-down effect                          No 



WEAR-DOWN-FACTOR EXPERIMENT 
RESULTS 

First, the Shapiro-Wilk normality test was carried 
out, which indicated that the observations of the 
wear-down-factor test were normally distributed. 
Therefore, t-tests were carried out to determine 
whether the results of the two groups of 
participants were significantly different. 

For the none-none-medium condition, the mean 
value of the perceived emotional feeling for Story 3 
was 4.4. In the medium-medium-medium condition, 
the mean perceived emotional feeling was 4.7. The 
t-test revealed no significant difference between the 
ratings of the two groups. Thus, the third phenom- 
enon was not supported by the test result. 
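
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis script used in the study: the individual ratings are hypothetical placeholders chosen to reproduce the reported group means (4.4 and 4.7), the equal split of 20 participants per group is assumed, and the function name `pooled_t_statistic` is invented for this sketch. The normality check is omitted because the standard library offers no Shapiro-Wilk test; the critical value 2.024 is the two-tailed 5% value for 38 degrees of freedom from a standard t table.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for Student's independent-samples t-test (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# HYPOTHETICAL Story 3 ratings (not the study's data), 20 per group (assumed split).
none_none_medium = [4, 5, 3, 6, 4, 5, 4, 3, 5, 6, 4, 4, 5, 3, 6, 4, 5, 4, 3, 5]
medium_x3 = [5, 4, 6, 5, 4, 5, 3, 6, 5, 4, 5, 6, 4, 5, 4, 5, 6, 3, 5, 4]

T_CRITICAL = 2.024  # two-tailed, alpha = 0.05, df = 38 (standard t table)

t_value, df = pooled_t_statistic(none_none_medium, medium_x3)
significant = abs(t_value) > T_CRITICAL
print(f"t({df}) = {t_value:.3f}, significant at 0.05: {significant}")
```

With these placeholder ratings the difference between the groups does not reach significance, which mirrors the direction of the reported result but says nothing about the actual data.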

DISCUSSION 

A summary of the experiment results is presented in 
Table 2. 

The experiment results support the first and 
second phenomena that predicted that the perceived 
emotional feelings from textual stories are strength- 
ened when the story is accompanied with suitable 
expressive images. 

It was first expected that when the agent 
presents a story with suitable expressions, the 
perceived emotional feelings will increase. As 
predicted, the participants who read the story with 
medium-expressive-intensity images perceived more 
emotional feelings than the participants who read the 
story with low-intensity images. 

The next question is whether a ceiling exists. Will 
the gain achieved from increasing expression inten- 
sity be lost when the intensity reaches an unrealistic 
level? We predicted that the extremely high-inten- 
sity expressive images may decrease the perceived 
emotional feelings. The experiment result partially 
supported this phenomenon as a marginally signifi- 
cant difference between the two conditions was 
found. 

Participants reported a significantly higher level 
of emotional feelings in the medium-intensity condi- 
tion than in the low-intensity condition. When the 
expressive facial images are exaggerated to an 
unrealistic level, the perceived emotional feelings 
start to decrease. 

The third phenomenon states that the influence 
of external factors, such as the wear-down effect, 
would affect the perceived emotional feelings. How- 
ever, the results show that the perceived emotional 
feelings remain stable. This may suggest that the 
perceived emotional feelings are independent of 
external factors, in particular, the number of times 
expressive images are displayed. That is, the effect 
that an expressive image has on the viewer is not 
changed whether the viewer has seen the image 
many times or whether it is the first time it has been 
displayed. Another explanation is that the compared 
data are both within the stable phase, and we should 
keep the display longer to move to the wear-down 
phase. 

EXPERIMENT IMPLICATIONS 

The experiments indicate that emotion intensity has 
an influence on perceived feelings. They also indicate 
that external factors such as the wear-down effect 
do not influence perception. The influences of 
emotion intensity are consistent. 

Some practical implications can be drawn for the 
design of affective computer interactions. 

• To convey emotional feelings, expressive im- 
ages should be provided during human-com- 
puter interaction. 

• Medium-intensity expressive images can 
achieve the best performance. Both decreasing 
the intensity and increasing the intensity to 
an extreme level will negatively influence 
the perceived feelings. 

• The perceived emotional feelings may be inde- 
pendent of factors such as display duration, or 
the stable phase is long. This means that the 
expressive images can be shown as soon as 
appropriate, and there is no need to worry that 
the perceived emotional feelings may decrease 
with reasonable repeated use. 

FUTURE TRENDS 

Future trends in emotion systems are to create 
emotion-aware systems: systems that are aware of 
the social and emotional state of the users in deter- 
mining the development of the interaction process. 
The development of systems interacting with and 
supporting the user in his or her tasks must consider 
important factors (e.g., expression intensity and 
timing) that may influence the perceived feelings. 

This work presents guidelines for displaying 
expressive images in a specific context. However, 
the results could be applicable to other contexts (e.g., 
emotional agents and human-human interface 
design). Further experiments can verify the results in 
other contexts and examine other factors (e.g., 
personality, application context, and colours) that 
may influence expressive-image presentation. Then, 
a full and clear guideline of expressive-image 
presentation may be established. 

CONCLUSION 

A set of experiments that tested the factors that may 
influence perceived emotional feelings when users 
interact with emotional agents was conducted. 

The experiment results demonstrated that inter- 
nal factors such as the intensity of expressive im- 
ages do significantly influence the perception of 
emotional feelings. The perceived emotional feel- 
ings do increase when the intensity of an expressive 
image increases. However, when the intensity of 
expressive images increases to an unrealistic level, 
the perceived emotional feelings will fall. The ex- 
periment examined external factors, such as the 
display time (wear-down effect), and found this 
does not produce a significant difference. It thus 
shows that the wear-down effect does not influence 
the perceived emotional feelings significantly. 

The research indicates that expressive images do 
influence the perceived emotional feelings, and as 
long as the display is valid, the appropriate expres- 
sive images can be shown. There is either no de- 
crease in perceived emotional feelings due to a 
wear-down effect, or the stable phase of the wear- 
down effect is very long. 

KEY TERMS 

Emotion: An excitement of the feelings caused 
by a specific exciting stimulus and manifested by 
some sensible effect on the body. 

Emotion-Evoking Manner: The methods to 
make readers perceive emotions. 

Emotion Icon: A combination of keyboard char- 
acters or small images meant to represent a facial 
expression. 

Emotional Communication: The activity of 
communicating emotional feelings. 

Personal Involvement Inventory: A mea- 
surement questionnaire developed by Zaichkowsky 
(1986). 

Software Agent: A computer program that car- 
ries out tasks on behalf of another entity. 

Wear-Down Effect: A reduction in the 
participant’s favourable responses after repeated 
exposures to a message. 

INTRODUCTION 

HCI might well be poised to break out of its mould, 
as defined by its first half-century history, and to 
redefine itself in another mould that is at once more 
abstract and wider in scope. In the process, it would 
redefine its very name, HCI becoming a subset of 
the larger field of information interaction (II). This 
potential transformation is what is described here. 

At this point in our technological era, we are in 
the process of symbolically modeling all aspects of 
reality such that our interactions with those aspects 
of the world around us that are most important are 
more digitally mediated. We are beginning to inhabit 
information environments and to interact ever more 
with artifacts, events, and processes that are pure 
information. This is the world of II, and what this 
means for HCI is what is examined here. 

The presentation has a largely abstract character 
to it. Indeed, it seeks to reframe our discussion of the 
phenomenon of interaction under study in such a 
way as to go beyond the pitfalls of concrete prob- 
lems usually associated with the field. By stepping 
back from the usual issues of concern and from the 
usual way of categorizing the elements of the field 
(Helander et al., 2000; Jacko & Sears, 2003), the 
goal is to contextualize HCI within a broader, neces- 
sarily philosophical plane of concern in order to look 
at it afresh and thereby see where it might be 
headed. The direction proposed is decidedly more 
englobing, more abstract, and, hence, more theoreti- 
cal in its analysis. 

BACKGROUND 

HCI is a field that grew out of the expansion of 
computing beyond the early context of usage by 
technically inclined specialists, who were quite ea- 
ger to access the potential of computing and did not 
mind the learning curve involved. The scope of HCI 
continues to expand, as computing becomes ever 
more pervasive and novice users expect to use 
computing artifacts without fuss, to put it bluntly. 
Thus, the goal of HCI is to ease usage while preserv- 
ing the power of the artifact, effecting whatever 
compromises are possible in order to achieve a 
workable solution. That this goal is difficult not only 
to achieve but even to have accepted is well illus- 
trated by Carroll’s (1990, 1998) proposal for 
minimalism and by Norman’s (1998) proposal for 
information appliances, building on the notion initially 
proposed by Raskin (see Norman). 

So we continue to indulge in situations where 
complex system requirements are specified and 
HCI expertise is brought in to do what it may to 
perhaps ameliorate the situation somewhat. At- 
tempts to break out of this design context (as through 
the various means presented in section II of the 
Handbook of HCI [Helander et al., 2000]) certainly 
point the way but may only succeed when computing 
itself is seen to disappear (in the spirit of Weiser and 
Brown’s [1997] ubiquitous computing and Norman’s 
[1998] “invisible” computer) into the larger context 
of human activity structures. Thus, how we view 
cognitive tasks is central to HCI past, present, and 
future, and needs to be considered in a high-level 
framework, as described next. 

The most basic question of HCI is what the 
interaction is between. The three elements generally 
involved in the answer are the person (user), the 
system (computer and its interface), and the task 
(goal). An answer with more guts or more ambition 
would do away with the middle element and pursue 
analysis purely in terms of person and task. Doing 
away with the interface itself is, after all, the ulti- 
mate in the quest of transparency that drives all HCI 
design. 

A computer system, represented to the person by 
its interface, is an artifact that mediates some 
specific process (i.e., supports the interfacing between 
person and task such that the person can realize the 
task). The person does not care about the interface 
(it is just a tool) but does care a great deal about the 
task. Transparency in HCI means forgetting about 
the interface. 

Ubiquitous computing (Weiser & Brown, 1997) 
shares that same goal of transparency, although with 
a focus on having computers embedded everywhere 
within the environment. Here, the attention is not on 
computing itself (even if it is pervasive) but on 
accomplishing a task (i.e., interacting with the 
environment and more specifically with the information 
present in the environment). 

A good example of transparency from a more 
familiar domain (Duchastel, 1996) is the steering 
wheel in a car. The steering wheel is the interface 
between oneself and the road (I never think about 
the steering wheel, but I observe the bends in the 
road). The steering wheel disappears, as an ideal 
interface should, and all that is left is the road and me 
(the task and the person). 

A second aspect of the new HCI concerns 
interaction modalities and their concrete elements. 
Just as command modalities gave way to the WIMP 
paradigm of contemporary interfaces (Pew, 2003), 
the latter will give way to yet more natural interfaces 
involving speech and immersive technologies in the 
VR realm (see the following). The driver of this 
shift, beyond the developing feasibility of these 
technologies, is the HCI goal of adapting to humans 
through use of natural environmental settings (i.e., 
another facet of the transparency goal). The day 
when my interface will be an earpiece, lapel button, 
and ring (the button for sensory input of various 
kinds and for projection; the ring as a gestural 
device) may not be far off. Screens and wraparound 
glasses will be specialty devices, and keyboards and 
mice will be endangered species. 

These evolutions (of process and gear) will make 
the person see computing as interfacing, with cur- 
rent gear long forgotten and the computer, while 
ubiquitous, nevertheless invisible. The disappearing 
computer will not leave great empty spaces, how- 
ever. There will be agents to interact with (discussed 
later) and novel forms of interaction, discussed here. 

The new landscapes include application areas 
such as communication, education, entertainment, 
and so forth (Shneiderman, 2003). They all involve 
interaction with information but also add to the mix 
the social aspect of interaction, thus creating a new 
and more complex cognitive context of action. The 
backdrop for HCI has changed suddenly, and the 
cognitive context has evolved to a sociocognitive 
one, as illustrated by the current interest in CSCW, 
itself only part of the new landscape. 

The notion of interface can be reexamined (Carroll, 
2003; Shneiderman, 2003). In a very broad definition 
(Duchastel, 1996), an interface can be considered as 
the locus of interaction between person and environ- 
ment; more specifically, the information environ- 
ment within which the person is inserted. In these 
general terms, interfaces can be viewed as abstract 
cognitive artifacts that constrain or direct the inter- 
action between a person and that person’s environ- 
ment. In the end, the task itself is an interface, one 
that connects actor to goal through a structured 
process. Even the most archaic software is the 
concrete embodiment of a task structure. Thus, on 
the one hand, HCI deals with the person-information 
relation and is concerned with the design of informa- 
tion products; and on the other hand, it deals with the 
person-task relation and here is concerned with the 
guidance of process. It is the interplay between 
these two facets (product and process) that creates 
the richness of HCI as an applied field of the social 
sciences. 



IMPLICATIONS FOR HCI 

The constant novelty factor that we experience with 
technology generally and with computing in particu- 
lar sets us up for fully using our intelligence to adapt. 
Not only do the tools (interfaces) change but so do 
the tasks and activities themselves, as witnessed, for 
instance, by the arrival of Web browsing and many 
other Web tasks. In this respect, then, HCI is faced 
with a losing battle with mounting diversity and 
complexity, and can only purport to alleviate some of 
the strain involved with these needs for humans to 
adapt. What has happened to HCI as the process of 
adapting computers to humans? HCI must find ways 
to assist human adaptation with general means, such 
as only gradually increasing the complexity of an 
artifact, forcing stability in contexts that may prove 
otherwise unmanageable, increasing monitoring of 
the user, and just-in-time learning support. All of 
these means are merely illustrative of a style of HCI 
design effort of which we likely will see more and 
more in response to computing complexity. 

In reality, it is complexity of activity that has 
increased, not complexity of computing itself. Cars 
and telephones also have required adaptability for 
optimum usage recently. But as computing pen- 
etrates all areas more fully, and as the possibilities for 
more symbolic mediacy increase (e.g., the choices 
on the telephone now), the question to ask is how can 
HCI help? Are there general principles that can be 
applied? Perhaps not, for what we are witnessing 
here is the removal of the C from HCI. As computing 
becomes pervasive, it indeed disappears, as sug- 
gested earlier by Weiser and Brown (1997), and it is 
replaced by human-task interaction. Attention shifts 
up the scale of abstraction, and designers focus on 
task structure and context (Kyng & Mathiassen, 
1997; Winograd, 1997) more than on operational task 
mediators, even though somewhere along the line, 
hard tool design is needed. A more human-focused 
HCI (away from the software, more toward the 
experience) evolves. 

FUTURE TRENDS 

Computer agents in the form of software that carries 
out specialized tasks for a user, such as handling 
one’s telephoning, or in the form of softbots that seek 
out information and prepare transactions, are already 
well with us (Bradshaw, 1997). That their numbers 
and functions will grow seems quite natural, given 
their usefulness in an ever more digitized and net- 
worked world. 

What will grow out of the agent phenomenon, 
however, has the potential to radically transform the 
context of our interactions, both digital and not, and, 
hence, the purview and nature of HCI. The natural 
evolution of the field of agent technology (Jennings & 
Wooldridge, 1998) leads to the creation, deployment, 
and adaptation of autonomous agents (AAs) (Luck et 
al., 2003; Sycara & Wooldridge, 1998). These agents 
are expected to operate (i.e., make reasoned deci- 
sions) on behalf of their owners in the absence of full 
or constant supervision. What is at play here is the 
autonomy of the agent, the degree of decision-mak- 
ing control invested in it by the owner, within the 
contextual limits imposed by the owner for the task at 
hand and within the natural limits of the software. 

Seen from another perspective, the computer user 
removes himself or herself to an extent from the 
computer interactions that will unfold, knowing that 
the agent will take care of them appropriately and 
in the user’s best interest. We witness here a limited 
removal of the human (the H) from HCI. 

All this is relative, of course. Current stock 
management programs that activate a sale when 
given market conditions prevail already operate 
with a certain level of autonomy, as do process 
control programs that monitor and act upon indus- 
trial processes. Autonomy will largely increase, 
however, as we invest agents with abilities to learn 
(i.e., agents that learn a user’s personal tastes from 
observation of choices made by the user) and to use 
knowledge appropriately within limited domains. 
As we also develop in agents the ability to evolve 
adaptation (from the research strand known as 
artificial life) (Adami & Wilke, 2004), we will be 
reaching out to an agent world where growing 
(albeit specialized) autonomy may be the rule. HCI 
will be complemented with AAI (Autonomous Agent 
Interaction), for these agents will become partici- 
pants in the digital world just as we are, learning 
about one another through their autonomous inter- 
actions (Williams, 2004). 

As we populate digital space with agents that 
are more autonomous, we create an environment 
that takes on a life of its own in the sense that we 
create uncertainty and open interaction up to ad- 
venture in a true social context. Not only will people 
have to learn how to react to the agents that they 
encounter, the latter also will have to react to people 
and to other autonomous agents (Glass & Grosz, 
2003). The interfacing involved in this novel cogni- 
tive context is changing radically from its traditional 
meaning, with issues of understanding, trust, initia- 
tive, and influence coming to the fore. In discussing 
agents in the future of interfaces, Gentner and 
Nielsen (1996) talk of a shared world in which the 
user’s environment no longer will be completely 
stable, and the user no longer will be totally in 
control; and they were talking of one’s own assistive 
agents, not those of other people or of autonomous 
agents. The change occurring in HCI is merely 
reflecting the changing environment at large. 

Perhaps an easy way to grasp what might be 
involved is to consider avatar interaction in VR 
worlds. Avatars are interfaces to other humans 
involved in a social interaction. Just as with authentic 
settings in which they mingle, humans in virtual 
settings must learn something about the others involved 
and learn to compose with them harmoniously 
in the accomplishment of their goals. The 
important consideration in this situation is that while 
the VR world may be artificial and may be experienced 
vicariously in physical terms, in psychological 
terms, the VR world can be just as genuine as the 
real world, as hinted by Turkle’s (1995) interviews 
with digital world inhabitants (e.g., real life is just one 
more window). Interagent communication, just like 
its interpersonal counterpart, will be improvised and 
creative, with codes and norms emerging from the 
froth of the marketplace (Biocca & Levy, 1995). 
The potential for enhancing interaction certainly 
exists, particularly within VR worlds that not only 
reproduce but extend features of our regular world; 
new risks also appear, for instance, in the form of 
misrepresentation of agent intentions or outright 
deception (again, just as can occur in our normal 
interpersonal context) (Palmer, 1995). 

The point is that the new cognitive context that is 
being created by both VR worlds and autonomous 
agents roaming cyberspace, all of which are but 
software artifacts, changes how we view interacting 
with computers. There still will exist the typical 
applications for assisting us in accomplishing spe- 
cific creative tasks (and the associated HCI chal- 
lenges), but the greater part of our interfacing with 
digital artifacts more generally will resemble our 
interfacing with others in our social world. In addi- 
tion, interfacing specialists will be as concerned with 
the interface between AAs as with the interface 
between them and humans. 



CONCLUSION 

I foresee nothing short of a redefinition of the field, 
with classic HCI becoming a subset of a much 
wider-scoped field. This expansion shifts the focus 
of interfacing away from its traditional moorings in 
functionality and onto new landscapes that are much 
more sociocognitive in nature. The wider, more 
abstract notion of an interface being the locus of 
interaction between a person and his or her environ- 
ment leads us to define the field in terms of informa- 
tion interaction (II). Indeed, the environment that a 
person inhabits is ever more symbolically and digitally 
mediated. While psychology broadly defines 
that interaction in general terms, II defines it in 
symbolic terms. Information constantly gleaned from 
the environment regulates our actions, which, in turn, 
are increasingly effected through information. We 
enter the age of interaction design (Preece et al., 
2002; Winograd, 1997) and environment design 
(Pearce, 1997). 

This is particularly evident as we not only design 
interactions with information but also come to inhabit 
environments that are pure information (as VR 
worlds are). The added complexity resulting from 
the growth in autonomous agents potentially makes 
II all the more challenging, bringing, so to speak, a 
level of politics into what was hitherto a fairly 
individual and somewhat straightforward interac- 
tion. Agents can be both autonomous cognitive 
artifacts and assistive interfaces, depending on their 
design specifics. 

Donald (1991) shows how cognitive inventions 
have led to cultural transitions in the evolution of the 
human mind and specifically how the invention of 
external memory devices, in expanding our natural 
biological memories, has fueled the modern age, 
leading us to digital realms. Autonomous agents lead 
us beyond out-of-the-skin memories to out-of-the- 
skin actions via the delegation with which we invest 
our assistive agents. The implications of this possi- 
bility are immense, even if only perceived hazily at 
this moment. 

In sum, in the next few decades, HCI will trans- 
form itself into a much wider and more complex field 
based on information interaction. HCI will become a 
subset of the new field alongside AAI, dealing with 
interaction between autonomous agents. The new 
field will parallel the concerns of our own human- 
human interactions and thus involve social concerns 
alongside cognitive concerns. 

KEY TERMS 

Agent and Autonomous Agent: Software that 
carries out specialized tasks for a user. Agents 
operate on behalf of their owners in the absence of 
full or constant supervision. Autonomous agents 
have a greater degree of decision-making control 
invested in them by their owners. 

Artificial Life: The reproduction in digital mod- 
els of certain aspects of organic life, particularly the 
ability of evolving adaptation through mutations that 
provide a better fit to the environment. In informa- 
tion sciences, artificial life is not concerned with the 
physico-chemical recreation of life. 

Avatars: Computer-generated personas that are 
adopted by users to interface with other humans and 
agents involved in a social interaction, particularly in 
interacting in online virtual reality (VR) worlds. 

Cognitive Artifacts: A class of designed ob- 
jects that either can be considered in its concrete 
representations (interfaces, agents, software) or in 
its abstract mode as knowledge artifacts 
(contextualized functions, ideas, theories) (Duchastel, 
2002). 

Cognitive Task: An intellectual task, as opposed 
to a physical one. The range of such tasks has 
increased to the point that computers are involved 
more with communication than with computing; no 
longer do people only use computers to transact 
specific processes, but they also use them to stroll 
within new landscapes, as on the Web. 

Information Interaction: The wider, more ab- 
stract notion of an interface, seen as the locus of 
interaction between a person and his or her environ- 
ment. As that environment is ever more symbolically 
and digitally mediated, we are led to define more 
broadly the field in terms of information interaction 
(Duchastel, 2002). 

Interface: A surface-level representation with 
which a user interacts in order to use a piece of 
equipment or a software application with a view to 
engage in some purposeful task. The purpose of an 
interface essentially is to facilitate access to the 
tool’s functionality, whether we are dealing with 
physical tools or with mind tools. We can generalize 
this common notion of interface to define an inter- 
face as the locus of interaction between a person 
and his or her environment (Duchastel, 1996). 

WIMP: A style of graphic user interface that 
involves windows, icons, menus, and pointers. It 
replaced the older textual command style interface, 
and the term is now of historical interest only. 

INTRODUCTION 

Currently, most of the Web is designed from the 
viewpoint of helping people who know what they 
want but need help accomplishing it. User goals may 
range from buying a new computer to making vaca- 
tion plans. Yet, these are simple tasks that can be 
accomplished with a linear sequence of events. With 
information-rich sites, the linear sequence breaks 
down, and a straightforward process to provide 
users with information in a useful format does not 
exist. 

Users come to information-rich sites with com- 
plex problems they want to solve. Reaching a solu- 
tion requires meeting goals and subgoals by finding 
the proper information. Complex problems are often 
ill-structured; realistically, the complete sequence 
can’t even be defined because of users’ tendencies 
to jump around within the data and to abandon the 
sequence at varying points (Klein, 1999). To reach 
the answer, people need the information properly 
positioned within the situation context (Albers, 2003; 
Mirel, 2003a). System support for such problems 
requires users to be given properly integrated infor- 
mation that will assist in problem solving and deci- 
sion making. 

Complex problems normally involve high-level 
reasoning and open-ended problem solving. Conse- 
quently, designer expectations of stable require- 
ments and the ability to perform an exhaustive task 
analysis fall short of reality (Rouse & Valusek, 
1993). While conventional task analysis works for 
well-defined domains, it fails for the ill-structured 
domains of information-rich sites (Albers, 2004). 
Instead of exhaustive task analysis, the designer 
must shift to an analysis focused on providing a clear 
understanding of the situation from the user’s point 
of view and the user’s goals and information needs. 



BACKGROUND 

In today’s world, data almost invariably will come 
from a database. A major failing of many of these 
systems is that they never focus on the human- 
computer interaction. Instead, the internal structure 
of the software or database is reflected in both the 
interface operation and the output. 

The problem is not lack of content. Information- 
rich sites normally have a high information content 
but inefficient design results in low information 
transmission. From the psychological standpoint, the 
information is disseminated ineffectively. The infor- 
mation is not designed for integration with other 
information but rather is optimized for its own pre- 
sentation. As a result, users must look in multiple 
sources to find the information they need. While 
hypertext links serve to connect multiple sources, 
they often are not adequate. Johnson-Eilola and 
Selber (1996) argue that most hypertexts tend to 
maintain the traditional hierarchical organization of 
paper documents. 

Mirel (1996, 2003b) examined the difficulties 
users have with current report design and found that 
sites often provide volumes of information but fail to 
effectively answer a user’s questions. The informa- 
tion needed by professionals exists within the corpo- 
rate database, but with complex problems, there are 
no ready-made answers that can be pulled out with 
simple information retrieval techniques. Thus, it 
cannot be expected that relevant information can be 
found by direct means, but it must be inferred. 
Interestingly (and complicating the design), inferring 
results is what experts do best. While all readers 
need information to be properly integrated, the amount 
of integration and coherence of the information 
required varies. McNamara and her colleagues 
(McNamara, 2001; McNamara & Kintsch, 1996) 
have found that users with a higher topic knowledge 
level perform better with less integrated informa- 
tion. Following the same idea, Woods and Roth 
(1998) define the critical question as “how knowl- 
edge is activated and utilized in the actual problem- 
solving environment” (p. 420). 

Waern (1989) claims that one reason systems fail 
lies in the differences in perspective between the 
data generator and the information searcher. Much 
of the research on information structuring attempts 
to predefine user needs and, thus, the system breaks 
down when users try to go beyond the solution 
envisioned by the designers. Basden and Hibberd 
(1996) consider how current audience and task 
analysis methods tend to start with an assumption 
that all the information needed can be defined in 
advance and then collected into a database. In this 
view, the knowledge exists as external to the system 
and user. However, for systems that must support 
complex situations, the methods tend to break down. 
Spool (2003) found some designs drove people away 
by not answering their questions in the user’s con- 
text. 



DESIGN FOR INFORMATION-RICH 
SYSTEMS 

Interface and content designers increasingly are 
being called upon to address information needs that 
go beyond step-by-step instruction and involve com- 
municating information for open-ended questions 
and problems (Mirel, 1998, 2003b). Applying that 
approach to interface design can enhance user 
outcomes, as such systems can help to organize 
thinking rather than to suggest a course of action 
(Eden, 1988). The questions and problems that users 
bring to information-rich systems only can be ad- 
dressed by providing information specific to a situa- 
tion and presenting it in a way that supports various 
users’ goals and information needs (Albers, 2003). 

Addressing users’ goals and information needs 
breaks with the fundamental philosophy of a design 
created to step a user through a sequence. Complex 
situations contain lots of ambiguity and subtle infor- 
mation nuances. That fact, if nothing more, forces 
the human into the process, since computers simply 
cannot handle ambiguity. From the computer’s point 
of view, data are never ambiguous (if an image has 256 
shades of gray, then each value can be assigned to one 
and only one of 256 little bins). The easiest design 
method, one that is much too prevalent, is to ignore 
the ambiguity. The system displays the information 
and leaves it up to the user to sort out the ambiguity. 
From the start, designers must accept that information 
for a complex situation cannot be prestructured 
and must be designed to allow users to continuously 
adapt to it. Consequently, many of the standard 
considerations of stable requirements, exhaustive 
task analysis, and ignorance of cognitive interaction 
fail to apply and require reconsideration (Rouse & 
Valusek, 1993). This breakdown between the 
designer’s and the user’s thought processes ex- 
plains why conventional task analysis works for 
well-defined domains but fails for the ill-structured 
domains of information-rich sites (Albers, 2004). 
Instead, the designer must have a clear understanding 
of the situation from the user’s point of view, the 
user’s goals, and the user’s information needs. 
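
The remark about 256 shades of gray can be made concrete with a minimal sketch. The function name gray_bin and the 0.0 to 1.0 intensity scale are assumptions made for illustration only; they do not come from the cited literature. The point is that the computer always returns exactly one bin and keeps no record of how close a value was to the neighboring bin, so any ambiguity is left for the user to notice.

```python
def gray_bin(intensity: float, levels: int = 256) -> int:
    """Assign an intensity in [0.0, 1.0] to one and only one of `levels` bins."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie between 0.0 and 1.0")
    # The top edge (1.0) is folded into the last bin so every input has a bin.
    return min(int(intensity * levels), levels - 1)


# Two nearly identical readings land in different bins, and nothing in the
# result signals that the assignment was a borderline call.
print(gray_bin(0.4999))  # 127
print(gray_bin(0.5))     # 128
```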

Situation 

The situation is the current world state that the user 
needs to understand. A situation always exists with 
the user embedded within it. To understand a situa- 
tion, a user works within the situation by defining 
goals and searching for the information required to 
achieve the goals. An underlying assumption is that 
the user needs to interact with an information system 
in order to gain the necessary information and to 
understand the situation. In most cases, after under- 
standing the situation, the user will interact with the 
situation, resulting in a change that must be reflected 
in an updated system. 

Goal 

User goals are the high-level view that allows the 
entire situation to be understood in context. To 
maximize understanding, the information should di- 
rectly map onto the goal. Goals could be viewed 
from the user’s viewpoint as plans and from the 
system’s viewpoint as the road map detailing the 
possible routes to follow. Goals can consist of 
subgoals, which are solved in a recursive manner. 
Each goal gets broken into a group of subgoals, 
which may be broken down further, and each subgoal 
must be handled before the goal can be considered 
achieved. Goals should be considered from the user-situation 
viewpoint (what is happening and what does 
it mean to the user) rather than the system viewpoint 
(how can the system display a value for x). The 
interface provides a pathway for the user to obtain 
the information to achieve the goal. User goals 
provide the means of categorizing and arranging the 
information needs. 
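
The recursive goal structure described above can be sketched as a small tree. This is only an illustrative model under stated assumptions: the class name Goal, its fields, and the espresso example goals are hypothetical and are not an implementation taken from the cited work. It shows a goal counting as achieved only when every subgoal has been handled, and information needs being collected under the goals they serve.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A user goal with the information it needs and optional subgoals."""
    name: str
    information_needs: list[str] = field(default_factory=list)
    subgoals: list["Goal"] = field(default_factory=list)
    satisfied: bool = False  # consulted only for goals without subgoals

    def achieved(self) -> bool:
        # A leaf goal is achieved when the user marks it satisfied;
        # a parent goal is achieved only when every subgoal is achieved.
        if not self.subgoals:
            return self.satisfied
        return all(sub.achieved() for sub in self.subgoals)

    def all_information_needs(self) -> list[str]:
        # Collect the needs of this goal and, recursively, of its subgoals.
        needs = list(self.information_needs)
        for sub in self.subgoals:
            needs.extend(sub.all_information_needs())
        return needs


launch = Goal(
    "Decide whether to launch the espresso product",
    subgoals=[
        Goal("Understand market trends", ["quarterly sales by region"]),
        Goal("Compare competing products", ["market share by brand"]),
    ],
)
launch.subgoals[0].satisfied = True
print(launch.achieved())  # False: one subgoal is still open
```

Arranging information needs under goals in this way mirrors the text's point that goals, not system functions, are the means of categorizing the information.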

People set goals to guide them through a situation, 
but all people are not the same. Different people 
shape their goals differently and may set completely 
different goals. Rarely will an information-rich site 
be used by a homogeneous group of people sharing a 
common pool of goals. Instead, multiple user groups 
exist, with each group having a different pool of goals 
that must be addressed. These fundamental differ- 
ences arise from the different goals of the user. In a 
highly structured environment, the user’s basic goal 
is essentially one of efficiently completing the task, 
while in the unstructured information-rich environ- 
ment, the user is goal-driven and focused on problem 
solving and decision making. 

Information Needs 

Information needs are the information required for 
the user to achieve a goal. A major aspect of good 
design is ensuring that the information is provided in 
an integrated format that matches the information 
needs to the user goals. Information needs focus on 
the content that users require to address their goals. 
Interestingly and perhaps unfortunately, the content 
often gets short-changed in many design discussions. 
The problem is that content normally is assumed to 
already exist and to be usable as is and, thus, to be outside 
the scope of the human-computer interaction. While 
the content is situation-specific, it never will just 
appear out of nowhere in a fully developed form. 
Also, as a person interacts with a situation, the 
information the person wants for any particular goal 
changes as he or she gets a better grasp of the goal 
and the situation (Albers, 2004). 

The problem of addressing information needs 
extends well beyond having the information available 
and even having it well arranged. As users’ informa- 
tion needs increase, they find it hard to figure out 
what information they need. One study found that 
approximately half of the participants failed to ex- 
tract the proper information for ill-defined problems, 



even when the relevant graphs and illustrations 
were presented to them (Guthrie, Weber, & 
Kimmerly, 1993, as cited in van der Meij, Blijleven, & 
Jansen, 2003). Consider how much more difficult 
this can be when a user either does not know or is 
not sure the information exists within the system. 
Yet, the designer’s and technical writer’s jobs are 
to ensure that the user knows that the information 
exists, extracts the proper information, and under- 
stands its relevance. 

A good interface design must define the order in 
which the information must be presented, how it 
should be presented, and what makes it important to 
the situation and to the user’s goal. It also must 
define what information is not needed or not rel- 
evant, even though at first glance it seems impor- 
tant. Since information-rich sites lend themselves to 
a high degree of freedom and a large amount of 
unpredictability, understanding how information 
relates to the goals is imperative to helping users 
address their situations. 

Example: Marketing Analysis as a 
Complex Situation 

Managers have access to a huge amount of data 
that they need to analyze in order to make informed 
decisions. Normally, rather than providing any help 
with interpreting the information, report designers 
take the view of just asking what information is 
desired and ensuring it is contained somewhere 
within the system. 

For example, if a marketing analyst for a coffee 
manufacturer is inquiring into whether a new 
espresso product is likely to succeed in this specialized 
market, the analyst needs to view, process, and 
interact with a wide range of multi-scaled data. To 
figure out what it will take to break into and become 
competitive in the high-end espresso market, the 
analyst will examine as many markets, espresso 
products, and attributes of products as the analyst 
deems relevant to the company’s goals, and as 
many as the technical tools and cognitive capacity 
enable the analyst to analyze. Looking at these 
products, the analyst will move back and forth in 
scale between the big picture and detailed views. 
The analyst will assess how espresso has fared 
over past and current quarters in different channels 
of distribution, regions, markets, and stores, and 
impose on the data his or her own knowledge of 
seasonal effects and unexpected market conditions. 
For different brands and products, including varia- 
tions in product by attributes such as size, packaging, 
and flavor, the analyst might analyze 20 factors or 
more, including dollar sales, volume sales, market 
share, promotions, percent of households buying, 
customer demographics, and segmentation. The 
analyst will arrange and rearrange the data to find 
trends, correlations, and two- and three-way causal 
relationships; the analyst will filter data, bring back 
part of them, and compare different views. Each 
time, the analyst will get a different perspective on 
the lay of the land in the espresso world. Each path, 
tangent, and backtracking move will help the analyst 
to clarify the problem, the goal, and ultimately the 
strategic and tactical decisions (Mirel, 2003b). 
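
The arranging, filtering, and rearranging described in this example can be illustrated with a short sketch. Everything in it is hypothetical: the SALES records, their field names, and the helpers keep and total_by are invented for illustration and do not represent any real reporting tool or real market data. The sketch only shows how each regrouping of the same data yields a different view.

```python
from collections import defaultdict

# Hypothetical espresso sales records (illustrative values only).
SALES = [
    {"region": "West", "channel": "grocery", "quarter": "Q1", "dollars": 120},
    {"region": "West", "channel": "cafe", "quarter": "Q1", "dollars": 80},
    {"region": "East", "channel": "grocery", "quarter": "Q1", "dollars": 100},
    {"region": "East", "channel": "grocery", "quarter": "Q2", "dollars": 150},
]


def keep(rows, **criteria):
    """Filter: keep only the rows that match every given field value."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def total_by(rows, *keys):
    """Rearrange: total dollar sales grouped by the chosen fields."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["dollars"]
    return dict(totals)


# Big picture first, then a more detailed view of one channel.
print(total_by(SALES, "region"))
print(total_by(keep(SALES, channel="grocery"), "region", "quarter"))
```

A report that delivers only the raw rows leaves all of this regrouping to the analyst, which is the gap the example describes.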

By considering report analysis as a complex 
situation, the report interpretation methods do not 
have to be outside of the scope. The information the 
analyst needs exists. The problem is not a lack of 
data but a lack of clear methods and techniques to 
connect that data into an integrated presentation that 
fits the user’s goals and information needs. Rather 
than simply supplying the analyst with a bunch of 
numbers, the report designers should have per- 
formed an analysis to gain a deeper understanding of 
how the numbers are used and should have provided 
support to enable the user to perform that analysis in 
an efficient manner. 



FUTURE TRENDS 

Information-rich Web sites will continue to increase 
as more people expect to gain information via the 
Internet. In general, the information-rich sites focus 
on complex situations that contain too many factors 
to be completely analyzed, so it is essentially impos- 
sible to provide a complete set of information or to 
fully define the paths through the situation. 

In the near term, an artificial intelligence ap- 
proach will not work; with current or near-term 
technology, the computer system cannot come close 
to understanding the situational context and resolv- 
ing ambiguity. Rather, the system and interface 
design must provide proper support in order for users 
to gain a clear understanding of the solutions to their 
goals. Computers and people both excel at different 



tasks; effective design must balance the two and let 
each do what they do best. 

Rather than being dominated by a tool mindset, 
we need to ensure that the technology does not 
override the communication aspects. Addressing 
designs specific to a user’s goals means assuming a 
highly dynamic path with information being molded 
to fit each user group and each individual user. 
Rather than focusing on specific tasks that the 
system can perform, the analysis and design should 
focus on the user’s situation and on the goals to be 
achieved. Understanding the user’s goals, informa- 
tion needs, and information relationships provides a 
solid foundation for placing the entire situation in 
context and for solving the user’s problem. 

CONCLUSION 

With a complex situation, the user’s goal is one of 
problem solving and decision making, based on the 
user’s goals and information needs. As such, the 
user has no single path to follow to accomplish a task 
(Albers, 1997). Unlike the clear stopping point of 
well-defined tasks, with complex tasks, the decision- 
making process continues until the user quits or feels 
confident enough to move forward. 

Any complex situation contains an overabun- 
dance of data. As such, with complex situations, the 
user needs clearly structured information that helps 
to reveal solutions to the open-ended questions and 
provides connections across multiple-task proce- 
dures. Achieving an effective design requires know- 
ing what information is required, how to manipulate 
the information to extract the required knowledge 
from it, and how to construct mental models of the 
situation that can be used to handle unanticipated 
problems (Brown, 1986). 

Properly presented information with the proper 
content effectively addresses the user’s goals. Us- 
ers work within a complex situation with a set of 
open-ended goals that the system design must con- 
sider from the earliest stages (Belkin, 1980). The 
first step in meeting people’s information needs 
requires initially defining their goals and needs. But 
more than just a list of goals and data, the analysis 
also reveals the social and cognitive aspects of 
information processing and the information relation- 
ships within the readers’ mental models. Thus, the 
goal of an HCI designer is to develop a user-recognizable 
structure that maps onto both the user’s 
mental model of the situation and the situation con- 
text and bridges between them. The collected goals 
and information needs create a vision of the users 
focused on what open-ended questions they want 
answered and why (Mirel, 1998). Everything the 
user sees contributes to the acceptance of the 
information and its ability to support the needs of 
understanding a complex situation. 

As Mirel (2003b) states, “people’s actual ap- 
proaches to complex tasks and problems ... are 
contextually conditioned, emergent, opportunistic, 
and contingent. Therefore, complex work cannot be 
formalized into formulaic, rule-driven, context-free 
procedures” (p. 259). The analysis and design must 
consider the communication needs in complex situ- 
ations and the highly dynamic situational context of 
information, with a focus on the user’s goals and 
information needs as required to support the funda- 
mental user wants and needs. 



REFERENCES 

Albers, M. (1997). Information engineering: Creating 
an integrated interface. Proceedings of the 7th 
International Conference on Human-Computer 
Interaction. 

Albers, M. (2003). Complex problem solving and 
content analysis. In M. Albers & B. Mazur (Eds.), 
Content and complexity: Information design in 
software development and documentation (pp. 
263-284). Mahwah, NJ: Lawrence Erlbaum Associ- 
ates. 

Albers, M. (2004). Design for complex situations: 
Analysis and creation of dynamic Web informa- 
tion. Mahwah, NJ: Lawrence Erlbaum Associates. 

Basden, A., & Hibberd, P. (1996). User interface 
issues raised by knowledge refinement. Interna- 
tional Journal of Human-Computer Studies, 45, 
135-155. 

Belkin, N. (1980). Anomalous states of knowledge 
as a basis for information retrieval. The Canadian 
Journal of Information Science, 5, 133-143. 



Brown, J. (1986). From cognitive to social ergonomics 
and beyond. In D. Norman & S. Draper (Eds.), 
User centered system design: New perspectives 
on human-computer interaction (pp. 457-486). 
Mahwah, NJ: Lawrence Erlbaum Associates. 

Eden, C. (1988). Cognitive mapping. European 
Journal of Operational Research, 36, 1-13. 

Johnson-Eilola, J., & Selber, S. (1996). After auto- 
mation: Hypertext and corporate structures. In P. 
Sullivan, & J. Dautermann. (Eds.), Electronic 
literacies in the workplace (pp. 115-141). Urbana, 
IL: NCTE. 

Klein, G. (1999). Sources of power: How people 
make decisions. Cambridge, MA: MIT. 

McNamara, D. (2001). Reading both high and low 
coherence texts: Effects of text sequence and prior 
knowledge. Canadian Journal of Experimental 
Psychology, 55, 51-62. 

McNamara, D., & Kintsch, W. (1996). Learning 
from text: Effects of prior knowledge and text 
coherence. Discourse Processes, 22, 247-287. 

Mirel, B. (1996). Writing and database technology: 
Extending the definition of writing in the workplace. 
In P. Sullivan & J. Dautermann. (Eds.), Electronic 
literacies in the workplace (pp. 91-114). Urbana, 
IL: NCTE. 

Mirel, B. (1998). Applied constructivism for user 
documentation. Journal of Business and Technical 
Communication, 12(1), 7-49. 

Mirel, B. (2003a). Interaction design for complex 
problem solving: Getting the work right. San 
Francisco: Morgan Kaufmann. 

Mirel, B. (2003b). Design strategies for complex 
problem-solving software. In M. Albers & B. Mazur 
(Eds.), Content and complexity: Information de- 
sign in software development and documentation 
(pp. 255-284). Mahwah, NJ: Lawrence Erlbaum Associates. 

Rouse, W., & Valusek, J. (1993). Evolutionary 
design of systems to support decision making. In G. 
Klein, J. Orasanu, R. Calderwood, & C. Zsambok 
(Eds.), Decision making in action: Models and 
methods (pp. 270-286). Norwood, NJ: Ablex. 









Spool, J. (2003). 5 things to know about users. 
Retrieved June 17, 2003, from 
http://www.uiconf.com/7west/five_things_to_know_article.htm 

van der Meij, H., Blijleven, P., & Jansen, L. (2003). 
What makes up a procedure? In M. Albers & B. 
Mazur (Eds.), Content and complexity: Informa- 
tion design in software development and docu- 
mentation (pp. 129-186). Mahwah, NJ: Lawrence 
Erlbaum Associates. 

Waern, Y. (1989). Cognitive aspects of computer 
supported tasks. New York: Wiley. 

Woods, D., & Roth, E. (1988). Cognitive engineering: 
Human problem solving with tools. Human 
Factors, 30(4), 415-430. 

KEY TERMS 

Complex Situation: The current world state 
that the user needs to understand. The understanding 
in a complex situation extends beyond procedural 
information and requires understanding the dynamic 
interrelationships of large amounts of information. 

Information Needs: The information that con- 
tributes to solving a goal. This information should be 
properly integrated and focused on the goal. 

Information-Rich Web Site: A Web site de- 
signed to provide the user with information about a 
topic, such as a medical site. In general, they contain 
more information than a user can be expected to 
read and understand. 

Situational Context: The details that make the 
situation unique for the user. 

User Goals: The specific objectives that a user 
wants to solve. In most complex situations, goals 
form a hierarchy with multiple tiers of subgoals that 
must be addressed as part of solving the primary 
goal. 

INTRODUCTION 

Human-Computer Interaction (HCI) in the 21st century 
needs to look very different from its 20th-century 
origins. Computers are becoming ubiquitous; 
they are disappearing into everyday objects. 
They are becoming wearable. They are able to 
communicate with each other autonomously, and 
they are becoming self-adaptive. Even with some- 
thing as ubiquitous as the mobile phone, we see a 
system that actively searches out a stronger signal 
and autonomously switches transmitters. Predictive 
techniques allow phones to adapt (e.g., anticipate 
long telephone numbers). These changes in tech- 
nologies require us to change our view of what HCI 
is. 

The typical view of how people interact with 
computers has been based primarily on a cognitive 
psychological analysis (Norman & Draper, 1986) of 
a single user using a single computer. This view sees 
the user as outside the computer. People have to 
translate their intentions into the language of the 
computer and interpret the computer’s response in 
terms of how successful they were in achieving their 
aims. This view of HCI leads to the famous gulfs of 
execution (the difficulty of translating human inten- 
tions into computer speak) and evaluation (trying to 
interpret the computer’s response). 

With the ubiquity of information appliances 
(Norman, 1999) or information artefacts (Benyon et 
al., 1999), the single-person, single-computer view of 
HCI becomes inadequate. We need to design for 
people surrounded by information artefacts. People 
no longer are simply interacting with a computer; 
they are interacting with people using various com- 
binations of computers and media. As computing 
devices become increasingly pervasive, adaptive, 
embedded in other systems, and able to communi- 
cate autonomously, the human moves from outside 
to inside an information space. In the near future, the 
standard graphical user interface will disappear for 
many applications, the desktop will disappear, and 



the keyboard and mouse will disappear. Information 
artefacts will be embedded both in the physical 
environment and carried or worn by people as they 
move through that environment. 

This change in the nature of computing demands 
a change in the way we view HCI. We want to move 
people from outside a computer, looking in to the 
world of information, to seeing people as inside 
information space. When we think of having a 
meeting or having a meal, we do not see people as 
outside these activities. People are involved in the 
activity. They are engaged in the interactions. In an 
analogous fashion, we need to see people as inside 
the activities of information creation and exchange, 
as inside information space. 

BACKGROUND 

The notion that we can see people as existing in and 
navigating through an information space (or multiple 
information spaces) has been suggested as an alter- 
native conceptualization of HCI (Benyon & Hook, 
1997). Looking at HCI in this way means looking at 
HCI design as the creation of information spaces 
(Benyon, 1998). Information architects design infor- 
mation spaces. Navigation of information space is 
not a metaphor for HCI. It is a paradigm shift that 
changes the way that we look at HCI. The concep- 
tion has influenced and been influenced by new 
approaches to systems design (McCall & Benyon, 
2002), usability (Benyon, 2001), and information 
gathering (Macaulay et al., 2000). 

The key concepts have developed over the years 
through experiences of developing databases and 
other information systems and through studying the 
difficulties and contradictions in traditional HCI. 
Within the literature, the closest ideas are those of 
writers on distributed cognition (Hutchins, 1995). A 
related set of ideas can be found in notions of resources 
that aid action (Wright et al., 2000). In both 
of these, we see the recognition that cognition does 
not simply take place in a person’s head. Cognition 
makes use of things in the world: cognitive artefacts, 
in Hutchins’ terms. If you think about moving through 
an urban landscape, you may have a reasonable plan 
in mind. You have a reasonable representation of the 
environment in terms of a cognitive map (Tversky, 
1993). But you constantly will be using cues and 
reacting to events. You may plan to cross the road 
at a particular place, but exactly where and when 
you cross the road depends on the traffic. Plans and 
mental models constantly are being reworked to take 
account of ongoing events. Navigation of informa- 
tion space seeks to make explicit the ways in which 
people move among sources of information and 
manage their activities in the world. 

MAIN FOCUS OF THE ARTICLE 

Navigation of information space is a new paradigm 
for thinking about HCI, just as direct manipulation 
was a new paradigm in the 1980s. Navigation of 
information space suggests that people are naviga- 
tors and encourages us to look to approaches from 
physical geography, urban studies, gardening, and 
architecture in order to inspire designs. Navigation 
of information space requires us to explore the 
concept of an information space, which, in turn, 
requires us to look at something that is not an 
information space. We conceptualize the situation 
as follows. The activity space is the space of real- 
world activities. The activity space is the space of 
physical action and physical experiences. In order to 
undertake activities in the activity space, people 
need access to information. At one level of descrip- 
tion, all our multifarious interactions with the expe- 
rienced world are effected through the discovery, 
exchange, organization, and manipulation of information. 
Information spaces are not the province of 
computers. They are central to our everyday experiences 
and range from something as simple as a sign 
for a coffee machine to a public information kiosk 
or a conversation with another person. 

Information spaces often are created explicitly to 
provide certain data and certain functions to facili- 
tate some activity — to help people plan, control, and 
monitor their undertakings. Information system de- 
signers create information artefacts by conceptual- 



izing some aspect of an activity space and then 
selecting and structuring some signs in order to make 
the conceptualization available to other people. Us- 
ers of the information artefact engage in activities by 
performing various processes on the signs. They 
might select items of interest, scan for some general 
patterns, search for a specific sign, calculate some- 
thing, and so forth. 

Both the conceptualization of the activity space 
and the presentation of the signs are crucial to the 
effectiveness of an information artefact to support 
some activity. Green and Benyon (1996) and Benyon, 
et al. (1999) provide many examples of both paper- 
based and computer-based information artefacts 
and the impact that the structuring and presentation 
have on the activities that can be supported with 
different conceptualizations of activity spaces and 
different presentations or interfaces on those 
conceptualizations. For example, they discuss the 
different activities that are supported by different 
reference styles used in academic publications, such 
as the Harvard style (the author’s name and date of 
publication, as used in this article) and the Numeric 
style (when a reference is presented in a 
numbered list). Another example is the difference 
between a paper train timetable and a talking time- 
table, or the activities that are supported by the 
dictionary facility in a word processor. 

All information artefacts employ various signs 
structured in some fashion and provide functions to 
manipulate those signs (conceptually and physi- 
cally). I can physically manipulate a paper timetable 
by marking it with a pen, which is something I cannot 
do with a talking timetable. I can conceptually 
manipulate it by scanning for arrival times, which is 
something I cannot do with a talking timetable. So, 
every information artefact constrains and defines an 
information space. This may be defined as the signs, 
structure, and functions that enable people to store, 
retrieve, and transform information. Information 
artefacts define information spaces, and information 
spaces include information artefacts. Information 
artefacts also are built on top of one another. Since 
an information artefact consists of a 
conceptualization of some aspect of the activity 
space and an interface that provides access to that 
conceptualization, whenever a perceptual display (an 
interface) is created, it then becomes an object in the 
activity space. Consequently, it may have its own 
information artefact designed to reveal information 
about the display. 

Figure 1. Conceptualization of information space and activities 

In addition to information artefacts, information 
spaces may include agents and devices. Agents are 
purposeful. Unlike information artefacts that wait to 
be accessed, agents actively pursue some goal. People 
are agents, and there are artificial agents such as 
spell checkers. Finally, there are devices. These are 
entities that do not deal with the semantics of any 
signals that they receive. They transfer or translate 
signs without dealing with meanings. Thus, we conceive 
of the situation as illustrated in Figure 1, where 
various activities are supported by information spaces. 
Note that a single activity rarely is supported by a 
single information artefact. Accordingly, people have 
to move across and between the agents, information 
artefacts, and devices in order to undertake their 
activities. They have to navigate the information 
space. 
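
The three kinds of entity in an information space can be sketched as a small data model. This is an illustrative sketch only: the class names follow the terms used in this article, but the fields, the navigate method, and the train example are assumptions introduced here, not a published design. The sketch shows why a single activity usually requires moving between several artefacts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InformationArtefact:
    """Signs held in some structure, with a function to retrieve them."""
    name: str
    signs: dict[str, str]

    def retrieve(self, key: str) -> Optional[str]:
        return self.signs.get(key)


@dataclass
class Agent:
    """A purposeful entity that actively pursues a goal."""
    name: str
    goal: str


@dataclass
class Device:
    """Transfers signs without dealing with their meaning."""
    name: str

    def transfer(self, sign: str) -> str:
        return sign  # passed on unchanged and uninterpreted


@dataclass
class InformationSpace:
    artefacts: list[InformationArtefact] = field(default_factory=list)
    agents: list[Agent] = field(default_factory=list)
    devices: list[Device] = field(default_factory=list)

    def navigate(self, needs: list[str]) -> dict[str, str]:
        """Map each needed sign to the first artefact that holds it."""
        route = {}
        for need in needs:
            for artefact in self.artefacts:
                if artefact.retrieve(need) is not None:
                    route[need] = artefact.name
                    break
        return route


space = InformationSpace(
    artefacts=[
        InformationArtefact("paper timetable", {"departure time": "08:15"}),
        InformationArtefact("station map", {"platform": "4"}),
    ],
    agents=[Agent("traveller", "catch the morning train")],
    devices=[Device("display screen")],
)
# One activity (catching a train) draws on two different artefacts.
route = space.navigate(["departure time", "platform"])
print(route)
```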

FUTURE TRENDS 

The position that we are moving toward — and the 
reason that we need a new HCI — is people and other 
agents existing inside information spaces. Of course, 
there always will be a need to look at the interaction 
of people with a particular device. But in addition to 
these traditional HCI issues are those concerned 
with how people can know what a particular device 
can and cannot do and with devices knowing what 
other devices can process or display. 



The conceptualization of HCI as the navigation 
of information spaces that are created from a 
network of interacting agents, devices, and infor- 
mation artefacts has some important repercussions 
for design. Rather than designing systems that 
support existing human tasks, we are entering an 
era in which we develop networks of interacting 
systems that support domain-oriented activities 
(Benyon, 1997). That is to say that we need to think 
about the big picture in HCI. We need to think about 
broad activities, such as going to work in the morning 
or cooking a meal for some friends, and about how a 
collection of information artefacts can both support 
these activities and make them enjoyable and rewarding. 

In its turn, this different focus makes HCI shift 
attention from humans, computers, and tasks to 
communication, control, and the distribution of do- 
main knowledge between the component agents 
and devices that establish the information space. 
We need to consider the transparency, visibility, 
and comprehensibility of agents and information 
artefacts, the distribution of trust, authority, and 
responsibility in the whole system, and issues of 
control, problem solving, and the pragmatics of 
communication. Users are empowered by having 
domain-oriented configurable agents and devices 
with which they communicate and share their knowl- 
edge. 

CONCLUSION 

Understanding people as living inside information 
spaces represents a new paradigm for thinking 
about HCI and, indeed, about cognition. There has 
been a failure of traditional cognitive science in its 
concept of mental representations both in terms of 
our ability to build intelligent machines and in our 
attempts to create really effective interactions be- 
tween people and computers. This new conception 
draws upon our spatial skills and spatial knowledge 
as its source. We have learned much about how to 
design to help people move through the built envi- 
ronment that we can apply to the design of informa- 
tion spaces. We understand that views of cognition 
based exactly on a spatial conception (Lakoff & 
Johnson, 1999) are providing new insights. Navigation 
of information space can be seen as part of this 
development. 

KEY TERMS 

Agent: An entity that possesses some function 
that can be described as goal-directed. 

Device: An entity that does not deal with infor- 
mation storage, retrieval, or transmission but only 
deals with the exchange and transmission of data. 

Information: Data that is associated with some 
system that enables meaning to be derived by some 
entity. 

Information Artefact: Any artefact whose pur- 
pose is to allow information to be stored, retrieved, 
and possibly transformed. 

Information Space: A collection of information 
artefacts and, optionally, agents and devices that 
enable information to be stored, retrieved, and pos- 
sibly transformed. 

Navigation of Information Space: (1) The
movement through and between information artefacts,
agents, and devices; (2) the activities designed to 
assist in the movement through and between infor- 
mation artefacts, agents, and devices. 

INTRODUCTION

A computer-aided education environment not only 
extends education opportunities beyond the tradi- 
tional classroom, but it also provides opportunities 
for intelligent interface based on agent-based tech- 
nologies to better support teaching and learning 
within traditional classrooms. Advances in informa- 
tion technology, such as the Internet and multimedia 
technology, have dramatically enhanced the way 
that information and knowledge are represented and 
delivered to students. The application of agent- 
based technologies to education can be grouped into 
two primary categories, both of which are highly 
interactive interfaces: (1) intelligent tutoring sys- 
tems (ITS) and (2) interactive learning environ- 
ments (ILE) (McArthur, Lewis, & Bishay, 1993). 
Current research in this area has looked at the 
integration of agent technology into education systems.
However, most agent-based education systems
underutilize intelligent features of agents such
as reactivity, pro-activeness, social ability
(Wooldridge & Jennings, 1995), and machine learning
capabilities. Moreover, most current agent-based
education systems are simply a group of non-collaborative
(i.e., non-interacting) individual agents.
Finally, most of these systems do not exploit
multi-agent intelligence to enhance the quality of
service in terms of the content provided by the interfaces.

A multi-agent system is a group of agents where 
agents interact and cooperate to accomplish a task, 
thereby satisfying goals of the system design (Weiss, 
1999). A group of agents that do not interact and do
not use the information obtained from such interactions
to help them make better decisions is simply
a group of independent agents, not a multi-agent
system. To illustrate this point, consider an ITS that
has been interacting with a particular group of 
students and has been collecting data about these 
students. Next, consider another ITS which is in- 
voked to deal with a similar group of students. If the 
second ITS could interact with the first ITS to obtain 
its data, then the second ITS would be able to handle 
its students more effectively, and together the two 
agents would comprise a multi-agent system. 

Most ITS or ILE systems in the literature do not 
utilize the power of a multi-agent system. The 
Intelligent Multi-agent Infrastructure for Distributed 
Systems in Education (I-MINDS) is an exception. It 
comprises a multi-agent system (MAS) infrastructure
that supports different high-performance
distributed applications on heterogeneous systems 
to create a computer-aided, collaborative learning 
and teaching environment. In our current I-MINDS 
system, there are two types of agents: teacher 
agents and student agents. A teacher agent gener- 
ally helps the instructor manage the real-time class- 
room. In I-MINDS, the teacher agent is unique in 
that it provides an automated ranking of questions 
from the students. This innovation presents ranked 
questions to the classroom instructor and keeps 
track of a profile of each class participant reflecting 
how they respond to the class lectures. A student 
agent supports a class participant’s real-time class- 
room experience. In I-MINDS, student agents 
innovatively support the buddy group formation. A 
class participant’s buddy group is his or her support 
group. The buddy group is a group of actual students 
that every student has access to during real-time 
classroom activities and with which they may discuss
problems. Each of these agents has its own interface
which, on the one hand, interacts with the user and,
on the other hand, receives information from other
agents and presents it to the user in a timely
fashion.

In the following, we first present some back- 
ground on the design choice of I-MINDS. Second, 
we describe the design and implementation of I- 
MINDS in greater detail, illustrating with concrete 
examples. We conclude with a discussion of future
trends and some conclusions drawn from the current 
design. 

BACKGROUND 

In this section, we briefly describe some virtual 
classrooms, interactive learning environments (ILE), 
and intelligent tutoring systems (ITS) — listing some 
of the features available in these systems — and then 
compare these systems with I-MINDS. The objective
of this section is to show that I-MINDS possesses
most of the functionalities and features found in other
systems and, in some cases, more advanced ones.
What sets I-MINDS significantly apart
from these systems is the multi-agent infrastructure 
where intelligent agents not only serve their users, 
but also interact among themselves to share data and 
information. Before moving further, we will provide 
some brief definitions of these systems. A virtual 
classroom is an environment where the students 
receive lectures from an instructor. An ILE is one 
where either the students interact among them- 
selves, or with the instructor, or both to help them 
learn. An ITS is one where an individual student 
interacts with a computer system that acts as a tutor 
for that student. In its current design, I-MINDS is a
full-fledged virtual classroom with an ILE, and has 
the infrastructure for further development into a 
system of intelligent tutors. I-MINDS currently has 
a complete suite of similar multimedia support fea- 
tures, important in virtual classrooms and interactive 
learning environments: live video and audio broad- 
casts, collaborative sessions, online forums, digital 
archival of lectures and discussions, text overlay on 
blackboard, and other media. The uniqueness of I- 
MINDS is that the features of its interactive learning 
environment and virtual classroom are supported by 
intelligent agents. These agents work individually to 
serve their users and collaboratively to support 
teaching and learning. 



Most ITSs such as AutoTutor (Graesser, Wiemer- 
Hastings, Wiemer-Hastings, Kreuz, & the Tutoring 
Research Group, 1999) have not been considered in 
the context of a multi-agent system. For example, 
one ITS A may store useful information about the 
types of questions suitable for a certain type of 
student based on its own experience. Another ITS B 
encounters such a student but fails to provide ques- 
tions that are suitable since it does not know yet how 
to handle this type of student. If the two ITSs can 
collaborate and share what they know, then B can 
learn from A to provide more suitable questions to 
the student. In systems such as AutoTutor, agents 
do not interact with other agents to exchange their 
experiences or knowledge bases. I-MINDS is dif- 
ferent in this regard. First, an agent in I-MINDS is 
capable of machine learning. A teacher agent is able 
to learn how to rank questions better as it receives 
feedback from the environment. A student agent is 
able to learn to more effectively form a buddy group 
for its student. Further, these student agents interact 
with each other to exchange information and expe- 
rience. 



I-MINDS 

The I-MINDS project has three primary areas of 
research: (a) distributed computing (i.e., the infra- 
structure and enabling technology), (b) intelligent 
agents, and (c) the specific domain application in 
education and instructional design. Our research on 
distributed computing examines consistency, 
scalability, and security in resource sharing among 
multiple processes. In our research on intelligent 
agents, we study interactions between teacher agent 
and student agents, and among student agents. For 
our application in education, we focus on automated 
question ranking by the teacher agent and buddy 
group formation by the student agents. 

In this section, we will focus our discussions on 
the intelligent agents and the multi-agent system and 
briefly on the instructional design. Readers are 
referred to Liu, Zhang, Soh, Al-Jaroodi, and Jiang 
(2003) for a discussion on distributed computing in I- 
MINDS using a Java object-oriented approach, to 
Soh, Liu, Zhang, Al-Jaroodi, Jiang, and Vemuri 
(2003) for a discussion on a layered architecture and 
proxy-supported topology to maintain a flexible and
scalable design at the system level, and to Zhang, 
Soh, Jiang, and Liu (2005) for a discussion on the 
multi-agent system infrastructure. 

The most innovative aspect of I-MINDS
when applied to education is the use of
agents that work individually behind-the-scenes and 
that collaborate as a multi-agent system. There are 
two types of agents in I-MINDS: teacher agents and 
student agents. In general, a teacher agent serves an 
instructor, and a student agent serves a student. 

Teacher Agent 

In I-MINDS, a teacher agent interacts with the 
instructor and the student agents. The teacher
agent ranks questions automatically for the instructor 
to answer, profiles the students through its interac- 
tion with their respective student agents, and im- 
proves its knowledge bases to better support the 
instructor. 

In our current design, the teacher agent evaluates 
questions and profiles students. The teacher agent 
has a mechanism that scores a question based on the 
profile of the student who asks the question and the 
quality of the question itself. A student who has been 
asking good questions will be ranked higher than a 
student who has been asking poor questions. The
quality of a question is judged by the number of weighted
keywords that it contains and whether it is picked by the
instructor to answer in real-time.

The teacher agent also has a self-learning com- 
ponent, which lends intelligence to its interface. In 
our current design, this component allows the agent 
to improve its own knowledge bases and its perfor- 
mance in evaluating and ranking questions. When a 
new question is asked, the teacher agent first 
evaluates the question and scores it. Then the 
teacher agent inserts the question into a ranked 
question list (based on the score of the question and 
the heuristic rules, to be described later) and dis- 
plays the list to the instructor. The instructor may 
choose which questions to answer. Whenever the 
instructor answers a question, he or she effectively 
“teaches” the teacher agent that the question is 
indeed valuable. If the question had been scored 
and ranked high by the teacher agent, this selection 
reinforces the teacher agent’s reasoning. This positive
reinforcement leads to increased weights
for the heuristics and keywords that had contrib- 
uted to the score and rank of the question, and vice 
versa. 
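
The scoring and reinforcement loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the I-MINDS implementation: the class name `QuestionRanker`, the additive combination of keyword weights and student profile, the learning rate, and the choice to lower weights when a question is discarded are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionRanker:
    """Hypothetical teacher-agent scorer; names and constants are illustrative."""
    keyword_weights: dict                 # keyword -> weight (assumed representation)
    profile_weight: float = 1.0           # how much the asker's profile counts
    learning_rate: float = 0.1            # assumed step size for reinforcement
    ranked: list = field(default_factory=list)

    def score(self, question, student_profile):
        # Sum the weights of known keywords, then add the profile contribution.
        words = question.lower().split()
        keyword_score = sum(self.keyword_weights.get(w, 0.0) for w in words)
        return keyword_score + self.profile_weight * student_profile

    def submit(self, question, student_profile):
        # Score the new question and keep the list sorted best-first.
        s = self.score(question, student_profile)
        self.ranked.append((s, question))
        self.ranked.sort(key=lambda item: item[0], reverse=True)
        return s

    def feedback(self, question, answered):
        # Answered questions raise the weights of their keywords;
        # discarded ones lower them (one reading of "vice versa").
        direction = 1.0 if answered else -1.0
        for w in set(question.lower().split()):
            if w in self.keyword_weights:
                self.keyword_weights[w] += direction * self.learning_rate
```

In this sketch the instructor's choice is the only feedback signal; a fuller version would also adjust the heuristic rules mentioned in the text, which are not specified here.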

Figure 1 shows a screen snapshot of our teacher 
agent’s interface. The snapshot shows three com- 
ponents. First, the main window displays the lecture 
materials that could be a whiteboard (captured with 
a Mimios-based technology), a Web page, and any 
documents that appear on the computer screen.

Figure 1. Screen snapshot of the I-MINDS teacher agent

The Mimios-based technology transforms an ordinary
whiteboard into a digital one. It comes with a
sensor mounted on the ordinary whiteboard and a
stylus that transmits a signal when it is pressed against
the whiteboard during writing.
The sensor receives the signal and transfers the
handwriting on the whiteboard to
the computer. In Figure 1, the lecture material
happens to be a Microsoft PowerPoint slide on 
buffer zones, a topic in Geographic Information 
Systems (GIS). Second, the figure has a small 
toolbar, shown here at the top-left corner of the 
snapshot. Only an instructor can view and use this 
toolbar. This toolbar allows the instructor to save 
and/or transmit a learning material and change the 
annotation tools (pens, erasers, markers, and col- 
ors). Third, the snapshot shows a question display 
window at the bottom right corner. Once again, only 
the instructor can view and use this question display. 
The question display summarizes the questions,
ranked by their scores. The display window
also has several features. For example, an instructor 
may choose to answer or discard a question, may 
view the entire question, and may review the profile 
of the student who asked a particular question. 
Alternatively, the instructor may choose to hide the 
toolbar and the question’s display window so as not 
to interfere with her/his lecture materials. 

Student Agents 

A student agent supports the student whom it serves, 
by interacting with the teacher agent and other 
student agents. It obtains student profiles from the 
teacher agent, forms the student’s buddy group, 
tracks and records the student activities, and pro- 
vides multimedia support for student collaboration. 
First, each student agent supports the formation of a 
“buddy group,” which is a group of students with 
complementary characteristics (or profiles) who 
respond to each other and work together in online 
discussions. A student may choose to form his or her 
own buddy group if he or she knows about the other 
students and wants to include them in his or her 
buddy group. However, for students who do not 
have that knowledge, especially for remote students, 
the student agent will automatically form a buddy 
group for its student. I-MINDS also has two collaborative
features that are used by the buddy groups: a
forum and a whiteboard. The forum allows all bud- 
dies to ask and answer questions, with each message 
being color-coded. Also, the entire forum session is 
digitally archived, and the student may later review 
the session and annotate it through his or her student 
agent. The whiteboard allows all buddies to write, 
draw, and annotate on a community digital whiteboard. 
The actions on the whiteboard are also tracked and 
recorded by the student agent. 

Note that the initial formation of a buddy group is 
based on the profile information queried from the 
teacher agent and preferences indicated by the 
student. Then, when a student performs a collabora- 
tive activity (initiating a forum discussion or a 
whiteboard discussion, or asking a question), the 
student agent informs other student agents identified 
as buddies within the student’s buddy group of this 
activity. Thus, buddies may answer questions that 
the instructor does not have time to respond to in 
class. As the semester moves along, the student 
agent ranks the buddies based on their responsive- 
ness and helpfulness. The student agent will drop 
buddies who have not been responsive from the 
buddy group. The student agent also uses heuristics 
to determine “when to invite/remove a buddy” and 
“which buddy to approach for help.” The student 
agent adjusts its heuristic rules according to the 
current classroom environment. 
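
The ranking and dropping of buddies can be illustrated with a small sketch. The source does not give the actual heuristics, so everything here is an assumption: the `Buddy` record, the equal weighting of responsiveness and helpfulness, the neutral score for a buddy with no history, and the drop threshold.

```python
from dataclasses import dataclass


@dataclass
class Buddy:
    name: str
    requests: int = 0    # times this buddy was asked for help
    responses: int = 0   # times this buddy actually replied
    helpful: int = 0     # replies the student marked as helpful

    def score(self):
        if self.requests == 0:
            return 0.5  # assumed neutral score for a buddy with no history
        responsiveness = self.responses / self.requests
        helpfulness = self.helpful / self.responses if self.responses else 0.0
        # Equal weighting is an illustrative choice, not taken from the source.
        return 0.5 * responsiveness + 0.5 * helpfulness


def update_buddy_group(buddies, drop_below=0.2):
    """Return buddies ranked best-first, dropping those under the threshold."""
    kept = [b for b in buddies if b.score() >= drop_below]
    return sorted(kept, key=lambda b: b.score(), reverse=True)
```

A student agent could call `update_buddy_group` periodically during the semester; adapting `drop_below` to the current classroom environment would correspond to the heuristic adjustment mentioned above.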

Figure 2 shows a screen snapshot of the I- 
MINDS student agent, which is divided into four 
major quadrants. The top-left quadrant is the win- 
dow that displays in real-time the lecture materials 
delivered from the teacher agent to each student 
agent. When the instructor changes a page, for 
example, the teacher agent will send the new page 
to the student agent. The student agent duly displays 
it. Further, when the instructor writes on a page, the 
teacher agent also transmits the changes to the 
student agent to display them for the student. The 
top-right quadrant is broken up into two sub-regions. 
On the top is a real-time video feed from the teacher 
agent. On the bottom is the digital archival repository 
of the lecture pages. A student may bring up and 
annotate each page. For example, he/she might 
paste a question onto a page and send it back to the 
instructor as a “question with a figure.” On the 
bottom-left quadrant is the forum. Each message
posted is colour-coded and labelled with the ID of
the student who posted the message. On the bottom-right
quadrant is the set of controls for asking
questions. A student can type in his or her questions
here, and then send the questions to the instructor, to
the buddy group, to both the instructor and buddy
group, or to a particular student in the buddy group.
A student can also invite other students to join his or
her buddy group through the “invite” function found
in this quadrant.

Figure 2. Screen snapshot of the I-MINDS student agent

The student agent interface has a menu bar on 
top, with menus in “Class,” “Presentation,” “Fo- 
rum,” “Slides,” “Collaboration,” and “Help.” The 
“Class” menu has features pertinent to registration, 
login, and setup of a class lecture. The “Presenta- 
tion” menu contains options on the lecture pages 
such as sound, colours, annotations, and so forth. 
The “Forum” menu allows a student to set up and
manage his or her forums. The “Slides” menu allows 
a student to archive, search and retrieve, and in 
general, manage all the archived lecture pages. 
Finally, the “Collaboration” menu provides options 
on features that support collaborative activities — 
grabbing a token of the digital whiteboard, initiating 
a digital whiteboard discussion, turning off the auto- 
mated buddy group formation, and so on. 



FUTURE TRENDS 

We see that, in the future, in the area of computer-aided
education systems, multi-agent intelligence
will play an important role in several aspects: (1)
online cooperative or collaborative environments will
become more active as personal student agents
become more pro-active and social in exchanging
information with other student agents to better serve
the learners; (2) remote learners in distance
education will enjoy virtual classroom interfaces
that can anticipate the needs and demands of
the learners and seamlessly situate the remote
learners virtually in a real classroom; and (3) interfaces
will adapt their functions and arrangements
based on the roles of, the information gathered
by, and the group activities engaged in by the
agents operating behind the interfaces.

CONCLUSION 

We have built a multi-agent infrastructure and inter- 
face, I-MINDS, aimed at helping instructors teach 
better and students learn better. The I-MINDS 
framework has many applications in education, due
to its agent-based approach and real-time capabilities
such as real-time in-class instruction with
instant data gathering and information dissemination,
unified agent and distributed computing architecture,
group learning, and real-time student response
monitoring. The individual interfaces are
able to provide timely services and relevant information
to their users with the support provided by the
intelligent agents working behind the scenes. We
have conducted a pilot study using two groups of
students in actual lectures (Soh, Jiang, & Ansorge,
2004). One group was supported by I-MINDS, with
the teacher delivering the lectures remotely and the
students collaborating and interacting via the virtual
classroom. The pilot study demonstrated some indi- 
cators of the effectiveness and feasibility of I- 
MINDS. Future work includes deploying I-MINDS 
in an actual classroom, incorporating cooperative 
learning and instructional design into the agents, and 
carrying out further studies on student learning. 

KEY TERMS

Agent: A module that is able to sense its envi- 
ronment, receive stimuli from the environment, make 
autonomous decisions and actuate the decisions, 
which in turn change the environment. 

Computer-Supported Collaborative Learn- 
ing: The process in which multiple learners work 
together on tasks using computer tools that leads to 
learning of a subject matter by the learners. 

Intelligent Agent: An agent that is capable of 
flexible behaviour: responding to events timely, ex- 
hibiting goal-directed behaviour and social behaviour, 
and conducting machine learning to improve its own 
performance over time. 

Intelligent Tutoring System (ITS): A soft- 
ware system that is capable of interacting with a 
student, providing guidance in the student’s learning 
of a subject matter. 

Interactive Learning Environment (ILE): A
software system that interacts with a learner and
may immerse the learner in an environment condu- 
cive to learning; it does not necessarily provide 
tutoring for the learner. 

Machine Learning: The ability of a machine to
improve its performance based on previous results.

Multi-Agent System: A group of agents where
agents interact to accomplish tasks, thereby satisfying
goals of the system design.

Virtual Classroom: An online learning space
where students and instructors interact.

INTRODUCTION

Time stretching, sometimes also referred to as time 
scaling, is a term describing techniques for replaying 
speech signals faster (i.e., time compressed) or
slower (i.e., time expanded) while preserving their
characteristics, such as pitch and timbre. One example
of such an approach is the SOLA (synchronous
overlap and add) algorithm (Roucos & Wilgus,
1985), which is often used to avoid cartoon-charac- 
ter-like voices during faster replay. Many studies 
have been carried out in the past in order to evaluate 
the applicability and the usefulness of time stretch- 
ing for different tasks in which users are dealing with 
recorded speech signals. One of the most obvious 
applications of time compression is speech skim- 
ming, which describes the actions involved in quickly 
going through a speech document in order to identify 
the overall topic or to locate some specific informa- 
tion. Since people can listen faster than they talk, 
time-compressed audio, within reasonable limits, 
can also make sense for normal listening, especially 
in view of He and Gupta (2001), who suggest that the
future bottleneck for consuming multimedia con- 
tents will not be network bandwidth but people’s 
limited time. In their study, they found that an upper 
bound for sustainable speedup during continuous 
listening is at about 1.6 to 1.7 times the normal speed.
This is consistent with other studies such as Galbraith, 
Ausman, Liu, and Kirby (2003) or Harrigan (2000), 
indicating preferred speedup ratios between 1.3 and 
1.8. Amir, Ponceleon, Blanchard, Petkovic, 
Srinivasan, and Cohen (2000) found that, depending 
on the text and speaker, the best speed for compre- 
hension can also be slower than normal, especially 
for unknown or difficult contents. 



BACKGROUND 

While all the studies discussed in the previous sec- 
tion have shown the usefulness of time stretching, 
the question remains how this functionality is best 
presented to the user. Probably the most extensive 
and important study of time stretching in relation to 
user interfaces is the work done by Barry Arons in 
the early and mid 1990s. Based on detailed user 
studies, he introduced the SpeechSkimmer interface 
(Arons, 1994, 1997), which was designed in order to 
make speech skimming as easy as scanning printed 
text. To achieve this, the system incorporates time- 
stretching as well as content-compression tech- 
niques. Its interface allows the modification of speech 
replay in two dimensions. By moving a mark verti- 
cally, users can slow down replay (by moving the 
mark down) or make it faster (by moving the mark 
upward), thus enabling time-expanded or time-com- 
pressed replay. In the horizontal dimension, content- 
compression techniques are applied. With content 
compression, parts of the speech signal whose con- 
tents have been identified as less relevant or unim- 
portant are removed in order to speed up replay. 
Importance is usually estimated based on automatic 
pause detection or the analysis of the emphasis used 
by the speaker. With SpeechSkimmer, users can 
choose between several discrete browsing levels, 
each of which removes more parts of the speech 
signal that have been identified as less relevant than 
the remaining ones. Both dimensions can be com- 
bined, thus enabling time as well as content com- 
pression during replay at the same time. In addition, 
SpeechSkimmer offers a modified type of backward 
playing in which small chunks of the signal are 
replayed in reverse order. It also offers some other 
features, such as bookmark-based navigation or
jumps to some outstanding positions within the speech 
signal. The possibility to jump back a few seconds 
and switch back to normal replay has proven to be 
especially useful for search tasks. Parts of these 
techniques and interface design approaches have 
been successfully used in other systems (e.g., 
Schmandt, Kim, Lee, Vallejo, & Ackerman, 2002; 
Stifelman, Arons, & Schmandt, 2001). 
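
Content compression based on pause detection, as described above, can be illustrated with a minimal sketch. It is not the SpeechSkimmer algorithm: the fixed frame length, the mean-energy measure, and the threshold value are assumptions chosen only to show the idea of removing segments judged less relevant.

```python
def remove_pauses(samples, frame_len=4, threshold=0.01):
    """Drop frames whose mean energy is below `threshold`.

    A crude stand-in for pause detection: quiet frames are treated as
    pauses and removed, which shortens replay without changing speed.
    """
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            kept.extend(frame)
    return kept
```

Raising `threshold` removes more of the signal, which loosely corresponds to moving to a more aggressive browsing level; real systems would use far longer frames and smoother pause boundaries.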

Current media players have started integrating 
time stretching into their set of features as well. 
Here, faster and slower replay is usually provided in 
the interface by either offering some buttons that 
can be used to set replay speed to a fixed, discrete 
value, or by offering a slider-like widget to continu- 
ously modify replay speed in a specific range. It 
should be noted that if the content-compression part 
is removed from the SpeechSkimmer interface, the 
one-dimensional modification of replay speed by 
moving the corresponding mark vertically basically 
represents the same concept as the slider-like wid- 
get to continuously change replay speed in common 
media players (although a different orientation and 
visualization has been chosen). 

Figure 1a illustrates an example of a slider-like
interface, subsequently called a speed controller, 
which can be used to adapt speech replay to any 
value between 0.5 and 3.0 times the normal replay 
rate. Using such a slider to select a specific replay 
speed is very intuitive and useful if one wants to 
continuously listen to speech with a fixed time- 
compressed or time-expanded replay rate. How- 
ever, this interface design might have limitations in 
more interactive scenarios such as information seek- 
ing, a task that is characterized by frequent speed 
changes together with other types of interaction 
such as skipping irrelevant parts or navigating back
and forth. For example, one disadvantage of the
usual speed controllers concerns the linear scale. 
The study by Amir et al. (2000) suggests that 
humans’ perception of time-stretched audio is pro- 
portional to the logarithm of the speedup factor 
rather than linear in the factor itself. So, an increase 
from, say, 1.6 to 1.8 times the normal speed is 
perceived as more dramatic than changing the ratio 
from 1.2 to 1.4. Thus, the information provided by a
linear slider scale may be irrelevant or even counter- 
productive. In any case, explicitly selecting a spe- 
cific speedup factor does not seem to be the most 
intuitive procedure for information seeking. 

INTERACTIVE SPEECH SKIMMING 
WITH THE ELASTIC AUDIO SLIDER 

In addition to a speed controller, common media 
players generally include an audio-progress bar that 
indicates the current position during replay (see 
Figure 1b). By dragging the thumb on such a bar,
users can directly access any random part within the
file. However, audio replay is usually paused or
continued normally while the bar’s thumb is dragged.
The reason why there is no immediate audio feed- 
back is that the movements of the thumb performed 
by the users are usually too fast (if the thumb is 
moved quickly over larger distances), too slow (if the 
thumb is moved slowly or movement is paused), or 
too jerky (if the scrolling direction is changed quickly, 
if the user abruptly speeds up or jerks to a stop, etc.). 
Therefore, it is difficult and sometimes impossible to
achieve comprehensible audio feedback, even if
time-stretching techniques are applied to the signal
or small snippets are replayed instead of single
samples. On the other hand, such a slider-like interface,
in which navigation is based on a modification
of the current position within the file, could be very
useful, especially in situations where navigation using
a speed-controller interface is less intuitive and lacks
flexibility.

Figure 1. An audio player interface with speed controller and audio-progress bar

Figure 2. FineSlider (a): The scrolling speed depends on the distance between the slider thumb and
mouse pointer, as is shown in the distance-to-speed mapping (b)

Our approach for improved speech skimming is 
based on the idea of using time stretching in order to 
combine position-based navigation using the audio- 
progress bar with simultaneous audio feedback. It 
builds on the concept of elastic interfaces, which was 
originally introduced by Masui, Kashiwagi, and Borden 
(1995) for navigating and browsing discrete visual 
data. The basic idea of elastic interfaces is not to drag 
and move objects directly but instead to pull them 
along a straight line that connects the mouse pointer 
or cursor with the object (cf. Hurst, in press). The
speed with which the object follows the cursor’s 
movements depends on the distance between the 
cursor and the object: If this distance is large, the 
object moves faster, and if it gets smaller, the object 
slows down. This behavior can be explained with the 
rubber-band metaphor, in which the direct connec- 
tion between the object and the cursor is interpreted 
as a rubber band. Hence, if the cursor is moved away 
from the object, the tension on the rubber band gets 
stronger, thus pulling the object faster toward the 
cursor position. If the object and cursor get closer to 
each other, the rubber band loosens and the force 
pulling the object decreases; that is, its movement 
becomes slower. 
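
The rubber-band behavior can be made concrete with a short simulation. This is a sketch under assumptions: the proportional relation between distance and speed, the `stiffness` constant, and the time step are illustrative choices, not values from the cited work.

```python
def elastic_step(object_pos, cursor_pos, stiffness=2.0, dt=0.05):
    """Advance the object one time step toward the cursor.

    Speed is proportional to the current distance (the "tension"),
    so the object moves fast when far away and slows as it closes in.
    """
    distance = cursor_pos - object_pos
    velocity = stiffness * distance
    return object_pos + velocity * dt


def simulate(object_pos, cursor_pos, steps):
    """Return the object's position after each step, cursor held still."""
    positions = [object_pos]
    for _ in range(steps):
        object_pos = elastic_step(object_pos, cursor_pos)
        positions.append(object_pos)
    return positions
```

With the cursor held at a fixed position, each step covers a fixed fraction of the remaining distance, so successive movements shrink and the object approaches the cursor without overshooting.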

One example of an elastic interface is the so-called
FineSlider (Masui et al., 1995). Here, the
distance between the mouse pointer and the slider
thumb along a regular slider bar is mapped to movements
of the slider thumb using a linear mapping
function (see Figure 2). This distance can be interpreted
as a rubber band, as described above. The
main advantage of the FineSlider is that it allows a 
user to access any position within a document 
independent of its length, which is not necessarily 
true for regular sliders. Since the slider length 
represents the size of the document, moving the 
slider thumb one pixel might already result in a large 
jump in the file. This is because the number of pixels 
on the slider (and thus the number of positions it can 
access in the document) is limited by the corre- 
sponding window size and screen resolution, while 
on the other hand, the document can be arbitrarily 
long. By choosing an appropriate distance-to-speed 
mapping for the FineSlider, small distances be- 
tween the slider thumb and the mouse pointer can 
be mapped to scrolling speeds that would otherwise 
only be possible with subpixel movements of the 
thumb. The FineSlider thus enables access to any 
random position of the file independent of its actual 
length. 
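
A small worked example may make the resolution argument concrete. The numbers (a 500-pixel slider, a 100,000-line document, a gain of 2 lines per second per pixel) and both function names are assumptions chosen for illustration; only the linear form of the mapping comes from the text.

```python
def lines_per_pixel(document_lines: int, slider_pixels: int) -> float:
    """Smallest jump a directly dragged thumb can make, in document lines."""
    return document_lines / slider_pixels


def elastic_rate(distance_px: float, gain: float) -> float:
    """Scrolling rate in lines per second under a linear
    distance-to-speed mapping (rate = gain * distance)."""
    return gain * distance_px


# Direct dragging: one pixel of thumb movement skips 200 lines.
jump = lines_per_pixel(100_000, 500)

# Elastic dragging: a pointer one pixel away from the thumb scrolls at
# 2 lines per second, so every line between two pixel positions is reachable.
rate = elastic_rate(1, 2.0)
```

With these assumed values, direct manipulation cannot address anything finer than 200-line steps, whereas the elastic mapping turns the same one-pixel offset into a slow, continuous scroll.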

If applied to a regular audio-progress bar, the 
concept of elastic interfaces offers two significant 
advantages. First, the thumb’s movements are no 
longer mapped directly to the corresponding posi- 
tions in the document, but are only considered 
indirectly via the distance-to-speed mapping. With 
regard to speech replay, this can be used to restrict 
scrolling speed and thus replay rates to a range in 
which audio feedback is still understandable and 
useful, such as 0.5 to 3.0 times the normal replay. 
Second, the motion of the thumb is much smoother 
because it is no longer directly controlled by the 
jerky movements of the mouse pointer (or the
user’s hand), thus resulting in a more reasonable,
comprehensible audio feedback. 

However, transferring the concept of elastic 
interfaces from visual to audio data is not straight- 
forward. In the visual case (compare Figure 2b), a 
still image is considered the basic state while the 
mouse pointer is in the area around the slider thumb. 
Moving the pointer to the right continuously in- 
creases the forward replay speed, while moving it to 
the left enables backward playing with an increased 
replay rate. In contrast to this, there is no still or 
static state of a speech signal. In addition, strict 
backward playing of a speech signal makes no sense 
(unless it is done in a modified way such as realized 
in Arons, 1994). Therefore, we propose to adapt the 
distance-to-speed mapping for audio replay as illus- 
trated in Figure 3a. Moving the pointer to the left 
enables time-expanded replay until the lower border 
of 0.5 times the normal replay speed is reached. 
Moving it to the right continuously increases replay 
speed up to a bound of 3.0 times the normal replay 
speed. The area around the slider thumb represents 
normal replay. 
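
The adapted mapping can be sketched as follows. Figure 3a is not reproduced here, so the piecewise-linear shape, the width of the neutral area (`dead_zone_px`), and the distance at which the bounds are reached (`range_px`) are assumptions; only the bounds of 0.5 and 3.0 and the normal-speed area around the thumb are taken from the text.

```python
NORMAL = 1.0   # replay rate in the area around the slider thumb
SLOWEST = 0.5  # lower bound named in the text
FASTEST = 3.0  # upper bound named in the text


def audio_speed(distance_px: float, dead_zone_px: float = 10.0,
                range_px: float = 200.0) -> float:
    """Map the signed pointer-thumb distance to a replay speed factor.

    Positive distances (pointer right of the thumb) speed replay up,
    negative distances slow it down. Unlike the visual case there is
    no pause and no backward replay: the result always lies between
    SLOWEST and FASTEST.
    """
    magnitude = abs(distance_px)
    if magnitude <= dead_zone_px:
        return NORMAL
    # Fraction of the way from the edge of the neutral area to the
    # distance at which the bound is reached, clamped to 1.
    t = min((magnitude - dead_zone_px) / (range_px - dead_zone_px), 1.0)
    target = FASTEST if distance_px > 0 else SLOWEST
    return NORMAL + t * (target - NORMAL)
```

Clamping at the bounds is what keeps audio feedback in the range the text describes as still understandable, however far the pointer is dragged.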

This functionality can be integrated into the au- 
dio-progress bar in order to create an elastic audio 
slider in the following way: If a user clicks anywhere 
on the bar’s scale to the right of the current position 
of the thumb and subsequently moves the pointer to 
the right, replay speed is increased by the value 
resulting from the distance-to-speed mapping illus- 
trated in Figure 3a. Moving the pointer to the left 
reduces replay speed accordingly. Placing or mov- 
ing the pointer to the left of the actual position of the 
thumb results in an analogous but time-expanding 



replay behavior. The rubber-band metaphor still 
holds for this modification when the default state is 
not a pause mode but playback (i.e., slider move- 
ment) at normal speed, which can either be in- 
creased or decreased. The force of the rubber band 
pulls the slider thumb toward the right or left, thereby 
accelerating or braking it, respectively. Increasing 
the tension of the rubber band by dragging the mouse 
pointer further away from the thumb increases the 
force and thus the speed changes. If the band is 
loosened, the tension of the band and the accelerat- 
ing or decelerating force on the slider is reduced. 

The visualization of this functionality in the 
progress bar is illustrated in Figure 4. As soon as the 
user presses the mouse button anywhere on the 
slider scale, three areas of different color indicate 
the replay behavior. The purple area around the 
slider thumb represents normal speed. Green, to the 
right of the thumb, indicates accelerated playback 
(as in “Green light: Speed up!”), while the red color
on the left stands for slower replay (as in “Red light: 
Slow down!”). A tool tip attached to the mouse 
pointer displays the current speedup factor. Further 
details on the design decisions and the technical 
implementation of the interface can be found in 
Hurst, Lauer, and Gotz (2004b). 
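
The colour coding and tool tip can be illustrated with a short sketch. The function names and the tool-tip format are hypothetical, and the zone is derived here from the current speed factor rather than from pixel positions, which is a simplification of the visualization described above.

```python
def zone_colour(speed: float, normal: float = 1.0) -> str:
    """Colour of the area the pointer is in, derived from the speed factor."""
    if speed > normal:
        return "green"   # accelerated playback
    if speed < normal:
        return "red"     # slower replay
    return "purple"      # normal speed, area around the thumb


def tool_tip(speed: float) -> str:
    """Text shown next to the mouse pointer (assumed format)."""
    return f"{speed:.1f}x"
```

For example, a speed factor of 2.0 maps to the green area with the tool tip `2.0x`.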

With such an elastic audio slider, users can 
quickly speed up replay or slow it down, depending 
on the current situation and demand. On the other 
hand, a traditional speed-controller design is better 
suited for situations in which the aim is not to 
interactively modify replay speed but instead to 
continuously listen to a speech recording at a differ- 
ent but fixed speed over a longer period. Both cases 



Figure 3. Distance-to-speed mapping for the elastic audio slider without a speed controller (a) and
coupled with the speed controller (b; with basic speed set to 1.5)

Figure 4. Visualization of the elastic audio slider (a), with views for normal (b), time-expanded (c),
and time-compressed replay (d)




are very important applications of time stretching. 
Amir et al. (2000) present a study in which users’ 
assessments of different replay speeds identified a 
natural speed for each file used in the study that the 
participants considered as the optimal speedup ratio. 
In addition, they identified unified speedup assess- 
ments once replay was normalized, that is, after 
replay was set to the natural speed of the corre- 
sponding file. Amir et al.’s study only gives initial 
evidence that must be confirmed through further 
user testing. Nonetheless, these findings argue for a 
combination of the speed controller with the elastic 
audio-slider functionality. The speed controller can 
be used to select a new basic speed. As a conse- 
quence, the area around the thumb in the audio- 
progress bar no longer represents the normal replay 
rate, but is adapted to the new basic speed. The area 
to the left and right of the thumb are adapted 
accordingly, as illustrated in the distance-to-speed 
mapping shown in Figure 3b. This modification en- 
ables users to select a basic replay speed that is 
optimal for the current document as well as for their 
personal preferences. Using the elastic audio slider, 
they can subsequently react directly to events in the 
speech signal and interactively speed up or slow 
down audio replay. Hence, interaction becomes 
more similar to navigation by using regular scroll 
bars or sliders and thus is more useful for interactive 
tasks such as information seeking. 
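
One way to picture the coupling is a mapping whose neutral value is the selected basic speed instead of 1.0. The sketch below is an assumption-laden illustration: Figure 3b is not reproduced here, so it is assumed that the overall bounds stay at 0.5 and 3.0 and that the mapping is piecewise linear; all parameter names are hypothetical.

```python
def elastic_speed(distance_px: float, base_speed: float = 1.0,
                  slowest: float = 0.5, fastest: float = 3.0,
                  dead_zone_px: float = 10.0,
                  range_px: float = 200.0) -> float:
    """Replay speed for a signed pointer-thumb distance.

    `base_speed` is the value chosen with the speed controller; the
    area around the thumb replays at this rate, and the elastic
    interaction raises or lowers it toward the assumed bounds.
    """
    if not slowest <= base_speed <= fastest:
        raise ValueError("base_speed must lie within the speed bounds")
    magnitude = abs(distance_px)
    if magnitude <= dead_zone_px:
        return base_speed
    t = min((magnitude - dead_zone_px) / (range_px - dead_zone_px), 1.0)
    target = fastest if distance_px > 0 else slowest
    return base_speed + t * (target - base_speed)
```

With `base_speed=1.5`, as in the caption of Figure 3b, a pointer resting on the thumb gives 1.5, while large offsets to the right or left still end at the assumed bounds of 3.0 and 0.5.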

FUTURE TRENDS 

Our current work includes integrating additional 
time-compression techniques, such as pause reduc- 



tion, into the system. Maybe the most challenging 
and exciting opportunity for future research will be 
the combination of the elastic audio-skimming func- 
tionality with the opportunity to skim visual data 
streams at the same time, thus providing real 
multimodal data browsing. While we already proved 
the usefulness of elastic browsing for video data 
(Hurst, Gotz, & Lauer, 2004a), combined audiovi- 
sual browsing raises a whole range of new questions 
regarding issues such as how to handle the different 
upper and lower speedup bounds for audio and video, 
whether (and how) to provide audio feedback during 
visual skimming in the reverse direction, and how to 
maintain synchronized replay if pause reduction is 
used. 



CONCLUSION 

Common interface designs incorporating time- 
stretched speech replay in current media players are 
very useful to set the replay speed to a higher or 
lower rate. However, they lack the flexibility and 
intuitiveness needed for highly interactive tasks in 
which continuous changes of the replay speed, such 
as temporary speedups, are needed. For this reason, 
we introduced the elastic audio slider, a new inter- 
face design for the interactive manipulation of speech 
replay using time-stretching techniques. The pro- 
posed approach enhances the functionality of exist- 
ing audio interfaces in a natural and intuitive way. Its 
integration into the audio-progress bar does not 
replace any of the typical features and functionalities 
offered by this widget, and it can be coupled in a
beneficial way with the commonly used tool for time



stretching, that is, a speed-controller interface. 
Hence, it seamlessly integrates into common user- 
interface designs. The feasibility and usefulness of 
the proposed interface design were verified in a
qualitative user study presented in Hurst, Lauer, and 
Gotz (2004a). It proved that the elastic audio slider 
is a very useful tool for quickly speeding up audio 
replay (for example, to easily skip a part of minor 
interest) as well as to slow down replay temporarily 
(for example, to listen to a particular part more 
carefully). With the elastic audio slider, users are 
able to react quickly to events in the audio signal 
such as irrelevant passages or important parts and 
adapt the speed temporarily to the situation. This 
facilitates highly interactive tasks such as skimming 
and search.

KEY TERMS

Content Compression: A term that describes 
approaches in which parts of a continuous media file 
are removed in order to speed up replay and data 
browsing or to automatically generate summaries or 
abstracts of the file. In relation to speech signals, 
content-compression techniques often shorten the 
signals by removing parts that have been identified 
as less relevant or unimportant based on pause 
detection and analysis of the emphasis used by the 
speakers. 

Elastic Audio Slider: An interface design that 
enables the interactive manipulation of audio replay 
speed by incorporating the concept of elastic inter- 
faces in common audio-progress bars. 

Elastic Interfaces: Interfaces or widgets that 
manipulate an object, for example, a slider thumb, 
not by direct interaction but instead by moving it 
along a straight line that connects the object with the 
current position of the cursor. Movements of the 
object are a function of the length of this connection, 
thus following the rubber-band metaphor. 

Rubber-Band Metaphor: A metaphor that is 
often used to describe the behavior of two objects 
that are connected by a straight line, the rubber band, 
in which one object is used to pull the other one 
toward a target position. The moving speed of the 
pulled objects depends on the length of the line 
between the two objects, that is, the tension on the 
rubber band. Longer distances result in faster move- 
ments, and shorter distances in slower movements. 



SpeechSkimmer: A system developed by Barry 
Arons at the beginning of the ’90s with the aim of 
making speech skimming as easy as scanning printed 
text. For this, its interface offers various options to 
modify replay speed, especially by applying time- 
stretching and content-compression techniques. 

Speech Skimming: A term, sometimes also 
referred to as speech browsing or scanning, that 
describes the actions involved in skimming through a 
speech recording with the aim of classifying the 
overall topic of the content or localizing some par- 
ticular information within it. 

Time Compression: A term that describes the 
faster replay of continuous media files, such as audio 
or video signals. In the context of speech recordings, 
time compression usually assumes that special tech- 
niques are used to avoid pitch shifting, which other- 
wise results in unpleasant, very high voices. 

Time Expansion: A term that describes the 
slower replay of continuous media files, such as 
audio or video signals. In the context of speech 
recordings, time expansion usually assumes that 
special techniques are used to avoid pitch shifting, 
which otherwise results in unpleasant, very low 
voices. 

Time Stretching: Sometimes also referred to 
as time scaling, this term is often used to embrace 
techniques for the replay of continuous media files 
using time compression and time expansion, particu- 
larly in relation to speech signals in which faster or 
slower replay is achieved in a way that preserves the 
overall characteristics of the respective voices, such 
as pitch and timbre.

INTRODUCTION

The last 20 years have seen the development of a 
wide range of standards related to HCI (human- 
computer interaction). The initial work was by the 
ISO TC 159 ergonomics committee (see Stewart, 
2000b), and most of these standards contain general 
principles from which appropriate interfaces and 
procedures can be derived. This makes the stan- 
dards authoritative statements of good professional 
practice, but makes it difficult to know whether an 
interface conforms to the standard. Reed et al. 
(1999) discuss approaches to conformance in these 
standards. 

ISO/IEC JTC1 has established SC35 for user 
interfaces, evolving out of work on keyboard lay- 
outs. This group has produced standards for icons, 
gestures, and cursor control, though these do not 
appear to have been widely adopted. 

More recently, usability experts have worked 
with the ISO/IEC JTC1 SC7 software-engineering 
subcommittee to integrate usability into software 
engineering and software-quality standards. This 
has required some compromises: for example, rec- 
onciling different definitions of usability by adopting 
the new term quality in use to represent the ergo- 
nomic concept of usability (Bevan, 1999). 



It is unfortunate that at a time of increasing 
expectations of easy access to information via the 
Internet, international standards are expensive and 
difficult to obtain. This is an inevitable consequence 
of the way standards bodies are financed. Information 
on how to obtain standards can be found in Table 4. 



TYPES OF STANDARDS FOR HCI 

Standards related to usability can be categorised as 
primarily concerned with the following. 

1. The use of the product (effectiveness, effi- 
ciency, and satisfaction in a particular context 
of use) 

2. The user interface and interaction 

3. The process used to develop the product 

4. The capability of an organisation to apply user- 
centred design 

Figure 1 illustrates the logical relationships: The 
objective is for the product to be effective, efficient, 
and satisfying when used in the intended contexts. A 
prerequisite for this is an appropriate interface and 
interaction. This requires a user-centred design process,
which, to be achieved consistently, requires an
organisational capability to support user-centred
design.

Figure 1. Categories of standards

DEVELOPMENT OF ISO STANDARDS

International standards for HCI are developed under 
the auspices of the International Organisation for 
Standardisation (ISO) and the International 
Electrotechnical Commission (IEC). ISO and IEC 
comprise national standards bodies from member 
states. The technical work takes place in working 
groups of experts, nominated by national standards 
committees. 

The standards are developed over a period of 
several years, and in the early stages, the published 
documents may change significantly from version to 
version until consensus is reached. As the standard 
becomes more mature, from the committee-draft 
stage onward, formal voting takes place by partici- 
pating national member bodies. 

The status of ISO and IEC documents is 
summarised in the title of the standard, as described 
in Table 1, and Table 2 shows the main stages of 
developing an international standard. 
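
The naming scheme can be made concrete with a small parser. This is a minimal sketch, not an official tool: the `Designation` type, the regular expression, and the function names are hypothetical, and two-part prefixes such as "CD TR" are deliberately not handled. The stage numbers are the ones listed in Table 2.

```python
import re
from typing import NamedTuple, Optional

# Draft document types and their stage numbers, as listed in Table 2.
DRAFT_STAGES = {
    "AWI": 1, "WD": 2, "CD": 3,
    "CDV": 4, "DIS": 4, "FCD": 4, "DTR": 4, "DTS": 4,
    "FDIS": 5,
}


class Designation(NamedTuple):
    body: str                # "ISO", "IEC" or "ISO/IEC"
    doc_type: Optional[str]  # e.g. "TR", "TS", "PAS", "DIS"; None for a standard
    number: int
    part: Optional[int]
    year: Optional[int]


PATTERN = re.compile(
    r"^(?P<body>ISO/IEC|ISO|IEC)"
    r"(?:\s+(?P<type>[A-Z]+))?"
    r"\s+(?P<number>\d+)"
    r"(?:-(?P<part>\d+))?"
    r"(?:\s+\((?P<year>\d{4})\))?$"
)


def parse_designation(title: str) -> Designation:
    """Split a title such as 'ISO TR 1234 (2004)' into its components."""
    m = PATTERN.match(title.strip())
    if m is None:
        raise ValueError(f"unrecognised designation: {title!r}")
    return Designation(
        body=m["body"],
        doc_type=m["type"],
        number=int(m["number"]),
        part=int(m["part"]) if m["part"] else None,
        year=int(m["year"]) if m["year"] else None,
    )


def draft_stage(d: Designation) -> Optional[int]:
    """Stage number for draft types; None for published documents."""
    return DRAFT_STAGES.get(d.doc_type) if d.doc_type else None
```

For instance, `parse_designation("ISO DIS 20282-1")` yields document type `DIS` and part 1, and `draft_stage` reports stage 4 for it.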

STANDARDS DESCRIBED IN THIS ARTICLE

Table 3 lists the international standards and techni- 
cal reports related to HCI that were published or 
under development in 2004. The documents are 



divided into two categories: those containing general
principles and recommendations, and those with 
detailed specifications. They are also grouped ac- 
cording to subject matter. All the standards are 
briefly described in Table 3. 

APPROACHES TO HCI STANDARDS 

HCI standards have been developed over the last 20 
years. One function of standards is to impose consis- 
tency, and some attempt has been made to do this by 
ISO/IEC standards for interface components such 
as icons, PDA (personal digital assistant) scripts, 
and cursor control. However, in these areas, de 
facto industry standards have been more influential 
than ISO, and the ISO standards have not been 
widely adopted. 

The ISO 9241 standards have had more impact 
(Stewart, 2000b; Stewart & Travis, 2002). Work on 
ergonomic requirements for VDT workstation hard- 
ware and the environment (ISO 9241, parts 3-9) 
began in 1983, and was soon followed by work on 
guidelines for the software interface and interaction 
(parts 10-17). The approach to software in ISO 9241 
is based on detailed guidance and principles for 
design rather than precise interface specifications, 
thus permitting design flexibility. 

More recently, standards and metrics for soft- 
ware quality have been defined by the software- 
engineering community. 

The essential user-centred design activities needed 
to produce usable products are described in the 
ergonomic standard ISO 13407. These principles 



Table 1. ISO and IEC document titles

| Example | Explanation |
|---|---|
| ISO 1234 (2004) | ISO Standard 1234, published in 2004 |
| ISO 1234-1 (2004) | Part 1 of ISO Standard 1234, published in 2004 |
| ISO/IEC 1234 (2004) | Joint ISO/IEC Standard 1234, published in 2004 |
| ISO TS 1234 (2004) | An ISO technical specification: A normative document that may later be revised and published as a standard |
| ISO PAS 1234 (2004) | An ISO publicly available specification: A normative document with less agreement than a TS that may later be revised and published as a standard |
| ISO TR 1234 (2004) | An ISO technical report: An informative document containing information of a different kind from that normally published in a normative standard |
| ISO xx 1234 (2004) | A draft standard of document type xx (see Table 2) |

Table 2. Stages of development of draft ISO documents

| Stage | Document type | | Description |
|---|---|---|---|
| 1 | AWI | Approved work item | Prior to a working draft |
| 2 | WD | Working draft | Preliminary draft for discussion by working group |
| 3 | CD | Committee draft | Complete draft for vote and technical comment by national bodies |
| | CD TR or TS | Committee draft technical report/specification | |
| 4 | CDV | Committee draft for vote (IEC) | Final draft for vote and editorial comment by national bodies |
| | DIS | Draft international standard | |
| | FCD | Final committee draft (JTC1) | |
| | DTR or DTS | Draft technical report/specification | |
| 5 | FDIS | Final draft international standard | Intended text for publication for final approval |



Table 3. Standards described in this article

| Section | Principles and recommendations | Specifications |
|---|---|---|
| 1. Context and test methods | ISO/IEC 9126-1: Software engineering - Product quality - Quality model | ISO DIS 20282-1: Ease of operation of everyday products - Context of use and user characteristics |
| | ISO/IEC TR 9126-4: Software engineering - Product quality - Quality-in-use metrics | ISO DTS 20282-2: Ease of operation of everyday products - Test method |
| | ISO 9241-11: Guidance on usability | ANSI/NCITS 354: Common Industry Format for usability test reports |
| | ISO/IEC PDTR 19764: Guidelines on methodology, and reference criteria for cultural and linguistic adaptability in information-technology products | Draft Common Industry Format for usability requirements |
| 2. Software interface and interaction | ISO/IEC TR 9126-2: Software engineering - Product quality - External metrics | ISO/IEC 10741-1: Dialogue interaction - Cursor control for text editing |
| | ISO/IEC TR 9126-3: Software engineering - Product quality - Internal metrics | ISO/IEC 11581: Icon symbols and functions |
| | ISO 9241: Ergonomic requirements for office work with visual display terminals. Parts 10-17 | ISO/IEC 18021: Information technology - User interface for mobile tools |
| | ISO 14915: Software ergonomics for multimedia user interfaces | ISO/IEC 18035: Icon symbols and functions for controlling multimedia software applications |
| | ISO TS 16071: Software accessibility | ISO/IEC 18036: Icon symbols and functions for World Wide Web browser toolbars |
| | ISO TR 19765: Survey of existing icons and symbols for elderly and disabled persons | ISO WD nnnn: Screen icons and symbols for personal, mobile, communications devices |
| | ISO TR 19766: Design requirements for icons and symbols for elderly and disabled persons | ISO WD nnnn: Icon symbols and functions for multimedia link attributes |
| | ISO CD 23974: Software ergonomics for World Wide Web user interfaces | ISO/IEC 25000 series: Software product-quality requirements and evaluation |
| | IEC TR 61997: Guidelines for the user interfaces in multimedia equipment for general-purpose use | |
| 3. Hardware interface | ISO 11064: Ergonomic design of control centres | ISO 9241: Ergonomic requirements for office work with visual display terminals. Parts 3-9 |
| | ISO/IEC CDTR 15440: Future keyboards and other associated input devices and related entry methods | ISO 13406: Ergonomic requirements for work with visual displays based on flat panels |
| | | ISO/IEC 14754: Pen-based interfaces - Common gestures for text editing with pen-based systems |
| 4. Development process | ISO 13407: Human-centred design processes for interactive systems | ISO/IEC 14598: Information technology - Evaluation of software products |
| | ISO TR 16982: Usability methods supporting human-centred design | |
| 5. Usability capability | ISO TR 18529: Human-centred life-cycle process descriptions | |
| | ISO PAS 18152: A specification for the process assessment of human-system issues | |
| 6. Other related standards | ISO 9241-1: General introduction | |
| | ISO 9241-2: Guidance on task requirements | |
| | ISO 10075-1: Ergonomic principles related to mental workload - General terms and definitions | |

have been refined and extended in a model of
usability maturity that can be used to assess the 
capability of an organisation to carry out user- 
centred design (ISO TR 18529). Burmester and 
Machate (2003) and Reed et al. (1999) discuss how 
different types of guidelines can be used to support 
the user-centred development process. 

STANDARDS 

Use in Context and Test Methods 

1. ISO 9241-11: Guidance on Usability 
(1998): This standard (which is part of the ISO 
9241 series) provides the definition of usability
that is used in subsequent related ergonomic 
standards. 

• Usability: The extent to which a product 
can be used by specified users to achieve 
specified goals with effectiveness, effi- 
ciency, and satisfaction in a specified con- 
text of use 

2. ISO/IEC 9126-1: Software Engineering — 
Product Quality — Part 1: Quality Model 
(2001): This standard describes six categories
of software quality that are relevant during 
product development including quality in use 
(similar to the definition of usability in ISO 
9241-11), with usability defined more narrowly
as ease of use (Bevan, 1999). 

3. ISO/IEC CD TR 19764: Guidelines on 
Methodology, and Reference Criteria for 
Cultural and Linguistic Adaptability in
Information-Technology Products (2003):
This defines a methodology and a guided check- 
list for the evaluation of cultural adaptability in 
software, hardware, and other IT products. 

4. ISO/IEC TR 9126-4: Software Engineer- 
ing — Product Quality — Part 4: Quality-in- 
Use Metrics (2004): Contains examples of 
metrics for effectiveness, productivity, safety, 
and satisfaction. 

5. ISO 20282: Ease of Operation of Everyday 
Products: Ease of operation is concerned with 
the usability of the user interface of everyday 
products. 

• Part 1: Context of Use and User Charac- 
teristics (DIS: 2004): This part explains 



how to identify which aspects are relevant 
in the context of use and describes how to 
identify the characteristics that cause vari- 
ance within the intended user population. 

• Part 2: Test Method (DTS: 2004): This 
specifies a test method for measuring the 
ease of operation of public walk-up-and- 
use products and of everyday consumer 
products. 

6. Common Industry Format 

• ANSI/NCITS 354: Common Industry For- 
mat for Usability Test Reports (2001): 
This specifies a format for documenting 
summative usability test reports for use in 
contractual situations, and is expected to 
become an ISO standard (Bevan, Claridge, 
Maguire, & Athousaki, 2002). 

• Draft Common Industry Format for Us- 
ability Requirements (2004): Specifies a 
format for documenting summative usabil- 
ity requirements to aid communication early 
in development, and is expected to be- 
come an ISO standard. 
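
The three components named in the ISO 9241-11 definition can be illustrated with a small calculation. The summary above does not prescribe specific measures, so this minimal sketch assumes three common choices: task completion rate for effectiveness, completed tasks per minute for efficiency, and the mean of a hypothetical 1-5 rating for satisfaction. The `Session` record and `summarise` function are illustrative names, not part of any standard.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Session:
    completed: bool    # did the user achieve the specified goal?
    minutes: float     # time spent on the task
    satisfaction: int  # questionnaire rating, assumed 1 (low) to 5 (high)


def summarise(sessions: List[Session]) -> Dict[str, float]:
    """Summarise test sessions for one product in one context of use."""
    completed = sum(1 for s in sessions if s.completed)
    total_minutes = sum(s.minutes for s in sessions)
    return {
        # Share of sessions in which the goal was achieved.
        "effectiveness": completed / len(sessions),
        # Goals achieved per minute of user time.
        "efficiency": completed / total_minutes,
        # Mean questionnaire rating.
        "satisfaction": mean(s.satisfaction for s in sessions),
    }
```

Because usability is defined for specified users, goals, and context of use, figures like these are only comparable between tests that share those conditions.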

Software Interface and Interaction 

These standards can be used to support user-inter- 
face development in the following ways. 

• To specify details of the appearance and 
behaviour of the user interface. ISO 14915 and 
IEC 61997 contain recommendations for mul- 
timedia interfaces. More specific guidance can 
be found for icons in ISO/IEC 11581, PDAs in
ISO/IEC 18021, and cursor control in ISO/IEC 
10741. 

• To provide detailed guidance on the design of 
user interfaces (ISO 9241, parts 12-17). 

• To provide criteria for the evaluation of user 
interfaces (ISO/IEC 9126, parts 2 and 3). 

However, the attributes that a product requires 
for usability depend on the nature of the user, task, 
and environment. ISO 9241-11 can be used to help
understand the context in which particular attributes 
may be required. Usable products can be designed 
by incorporating product features and attributes 
known to benefit users in particular contexts of use.

1. ISO 9241: Ergonomic Requirements for
Office Work with Visual Display Terminals:
ISO 9241 parts 10 and 12 to 17 provide
requirements and recommendations relating to 
the attributes of the software. 

• Part 10: Dialogue Principles (1996): This 
contains general ergonomic principles that 
apply to the design of dialogues between 
humans and information systems: suitability
for the task, suitability for learning,
suitability for individualisation, conformity 
with user expectations, self-descriptive- 
ness, controllability, and error tolerance. 

• Part 12: Presentation of Information (1998):
This part includes guidance on ways of 
representing complex information using 
alphanumeric and graphical or symbolic 
codes, screen layout, and design, as well 
as the use of windows. 

• Part 13: User Guidance (1998): Part 13 
provides recommendations for the design 
and evaluation of user-guidance attributes 
of software user interfaces including 
prompts, feedback, status, online help, and 
error management. 

• Part 14: Menu Dialogues (1997): It pro- 
vides recommendations for the design of 
menus used in user-computer dialogues, 
including menu structure, navigation, op- 
tion selection and execution, and menu 
presentation. 

• Part 15: Command Dialogues (1997): It 
provides recommendations for the design 
of command languages used in user-com- 
puter dialogues, including command-lan- 
guage structure and syntax, command rep- 
resentations, input and output consider- 
ations, and feedback and help. 

• Part 16: Direct Manipulation Dialogues 
(1999): This provides recommendations 
for the design of direct-manipulation dia- 
logues, and includes the manipulation of 
objects and the design of metaphors, ob- 
jects, and attributes. 

• Part 17: Form-Filling Dialogues (1998): It 
provides recommendations for the design 
of form-filling dialogues, including form 
structure and output considerations, input 
considerations, and form navigation. 



2. ISO/IEC 9126: Software Engineering — 
Product Quality: ISO/IEC 9126-1 defines 
usability in terms of understandability, 
learnability, operability, and attractiveness. 
Parts 2 and 3 include examples of metrics for 
these characteristics. 

• Part 2: External Metrics (2003): Part 2 
describes metrics that can be used to 
specify or evaluate the behaviour of the 
software when operated by the user. 

• Part 3: Internal Metrics (2003): Part 3 
describes metrics that can be used to 
create requirements that describe static 
properties of the interface that can be 
evaluated by inspection without operating 
the software. 

3. Icon Symbols and Functions 

• ISO/IEC 11581: Icon Symbols and Functions

• Part 1: Icons - General (2000): This
part contains a framework for the devel- 
opment and design of icons, including gen- 
eral requirements and recommendations 
applicable to all icons. 

• Part 2: Object Icons (2000) 

• Part 3: Pointer Icons (2000) 

• Part 4: Control Icons (CD: 1999) 

• Part 5: Tool Icons (2004) 

• Part 6: Action Icons (1999) 

• ISO/IEC 18035: Icon Symbols and Func- 
tions for Controlling Multimedia Software 
Applications (2003): This describes user 
interaction with and the appearance of 
multimedia control icons on the screen. 

• ISO/IEC 18036: Icon Symbols and Func- 
tions for World Wide Web Browser 
Toolbars (2003): This describes user inter- 
action with and the appearance of World 
Wide Web toolbar icons on the screen. 

• ISO WD nnnn: Screen Icons and Symbols 
for Personal Mobile Communications De- 
vices (2004): It defines a set of display- 
screen icons for personal mobile commu- 
nication devices. 

• ISO WD nnnn: Icon Symbols and Functions
for Multimedia Link Attributes (2004):
It describes user interaction with and the 
appearance of link attribute icons on the 
screen.

• ISO CD TR 19765: Survey of Existing
Icons and Symbols for Elderly and Disabled
Persons (2003): It contains examples
of icons for features and facilities used by 
people with disabilities. 

• ISO TR 19766: Design Requirements for 
Icons and Symbols for Elderly and Dis- 
abled Persons 

4. ISO 14915: Software Ergonomics for Mul- 
timedia User Interfaces: 

• Part 1: Design Principles and Framework
(2002): This part provides an overall introduction
to the standard.

• Part 2: Multimedia Control and Navigation
(2003): This part provides recommenda- 
tions for navigation structures and aids, 
media controls, basic controls, media-con- 
trol guidelines for dynamic media, and 
controls and navigation involving multiple 
media. 

• Part 3: Media Selection and Combination 
(2002): This part provides general guide- 
lines for media selection and combination, 
media selection for information types, 
media combination and integration, and 
directing users’ attention. 

• Part 4: Domain-Specific Multimedia In- 
terfaces (AWI): This part is intended to 
cover computer-based training, computer- 
supported cooperative work, kiosk sys- 
tems, online help, and testing and evalua- 
tion. 

5. IEC TR 61997: Guidelines for the User 
Interfaces in Multimedia Equipment for 
General-Purpose Use (2001): This gives 
general principles and detailed design guidance 
for media selection, and for mechanical, graphi- 
cal, and auditory user interfaces. 

6. ISO CD 23974: Software Ergonomics for 
World Wide Web User Interfaces (2004): 
It provides recommendations and guidelines 
for the design of Web user interfaces. 

7. ISO/IEC 18021: Information Technology - 
User Interface for Mobile Tools for Man- 
agement of Database Communications in a 
Client-Server Model (2002): This standard 
contains user-interface specifications for PDAs 
with data-interchange capability with corre- 
sponding servers. 



8. ISO/IEC 10741-1: Dialogue Interaction -
Cursor Control for Text Editing (1995): 

This standard specifies how the cursor should 
move on the screen in response to the use of 
cursor control keys. 

Hardware Interface 

1. ISO 9241: Ergonomic Requirements for 
Office Work with Visual Display Termi- 
nals: Parts 3 to 9 contain hardware design 
requirements and guidance. 

• Part 3: Visual Display Requirements 
(1992): This specifies the ergonomics re- 
quirements for display screens that ensure 
that they can be read comfortably, safely, 
and efficiently to perform office tasks. 

• Part 4: Keyboard Requirements (1998): 
This specifies the ergonomics design char- 
acteristics of an alphanumeric keyboard 
that may be used comfortably, safely, and 
efficiently to perform office tasks. Key- 
board layouts are dealt with separately in 
various parts of ISO/IEC 9995: Informa- 
tion Processing - Keyboard Layouts for 
Text and Office Systems (1994). 

• Part 5: Workstation Layout and Postural 
Requirements (1998): It specifies the er- 
gonomics requirements for a workplace 
that will allow the user to adopt a comfort- 
able and efficient posture. 

• Part 6: Guidance on the Work Environ- 
ment (1999): This part provides guidance 
on the working environment (including 
lighting, noise, temperature, vibration, and 
electromagnetic fields) that will provide 
the user with comfortable, safe, and pro- 
ductive working conditions. 

• Part 7: Requirements for Display with 
Reflections (1998): It specifies methods of 
measuring glare and reflections from the 
surface of display screens to ensure that 
antireflection treatments do not detract 
from image quality. 

• Part 8: Requirements for Displayed 
Colours (1997): It specifies the require- 
ments for multicolour displays. 

• Part 9: Requirements for Nonkeyboard 
Input Devices (2000): This specifies the 
ergonomics requirements for nonkeyboard 
input devices that may be used in conjunc- 
tion with a visual display terminal. 

2. ISO 13406: Ergonomic Requirements for 
Work with Visual Displays Based on Flat 
Panels 

• Part 1: Introduction (1999) 

• Part 2: Ergonomic Requirements for Flat- 
Panel Displays (2001) 

3. ISO/IEC 14754: Pen-Based Interfaces — 
Common Gestures for Text Editing with 
Pen-Based Systems (1999) 

4. ISO/IEC CD TR 15440: Future Keyboards 
and Other Associated Input Devices and 
Related Entry Methods (2003) 

5. ISO 11064: Ergonomic Design of Control 
Centres: This eight-part standard contains 
ergonomic principles, recommendations, and 
guidelines. 

• Part 1: Principles for the Design of Control 
Centres (2000) 

• Part 2: Principles of Control-Suite Ar- 
rangement (2000) 

• Part 3: Control-Room Layout (1999) 

• Part 4: Workstation Layout and Dimen- 
sions (2004) 

• Part 5: Human-System Interfaces (FCD: 2002) 

• Part 6: Environmental Requirements for 
Control Rooms (DIS: 2003) 

• Part 7: Principles for the Evaluation of 
Control Centres (DIS: 2004) 

• Part 8: Ergonomic Requirements for Spe- 
cific Applications (WD: 2000) 

The Development Process 

ISO 13407 explains the activities required for user- 
centred design, and ISO 16982 outlines the types of 
methods that can be used. ISO/IEC 14598 gives a 
general framework for the evaluation of software 
products using the model in ISO/IEC 9126-1. 

1. ISO 13407: Human-Centred Design Pro- 
cesses for Interactive Systems (1999) 

2. ISO TR 16982: Usability Methods Sup- 
porting Human-Centred Design (2002) 



3. ISO/IEC 14598: Information Technology - 
Evaluation of Software Products (1998-2000) 

Usability Capability of the Organisation 

The usability maturity model in ISO TR 18529 
contains a structured set of processes derived from 
ISO 13407 and a survey of good practice. It can be 
used to assess the extent to which an organisation is 
capable of carrying out user-centred design (Earthy, 
Sherwood Jones, & Bevan, 2001). ISO PAS 18152 
extends this to the assessment of the maturity of an 
organisation in performing the processes that make 
a system usable, healthy, and safe. 

• ISO TR 18529: Ergonomics of Human-System 
Interaction — Human-Centred Life-Cycle Pro- 
cess Descriptions (2000) 

• ISO PAS 18152: Ergonomics of Human-Sys- 
tem Interaction — A Specification for the Pro- 
cess Assessment of Human-System Issues 
(2003) 

Other Related Standards 

1. ISO 9241-2: Part 2: Guidance on Task 
Requirements (1992) 

2. ISO 10075: Ergonomic Principles Related 
to Mental Workload 

• Part 1: General Terms and Definitions 
(1994) 

• Part 2: Design Principles (1996) 

• Part 3: Principles and Requirements Con- 
cerning Methods for Measuring and As- 
sessing Mental Workload (2004) 

3. ISO TS 16071: Guidance on Accessibility 
for Human-Computer Interfaces (2003): 

This provides recommendations for the design 
of systems and software that will enable users 
with disabilities greater accessibility to com- 
puter systems (see Gulliksen & Harker, 2004). 

4. ISO AWI 9241-20: Accessibility Guideline 
for Information Communication Equipment 
and Services: General Guidelines 
(2004) 
WHERE TO OBTAIN 
INTERNATIONAL STANDARDS 

ISO standards have to be purchased. They can be 
obtained directly from ISO, or from a national stan- 
dards body (Table 4). 

FUTURE OF HCI STANDARDS 

Now that the fundamental principles have been 
defined, the ergonomics and software-quality stan- 
dards groups are consolidating the wide range of 
standards into more organised collections. Some of 
the new series are already approved work items, 
CDs, or DISs. 

ISO 9241: Ergonomics of Human-System 
Interaction 

The parts of ISO 9241 are in the process of being 
revised into the new structure shown below. 

• Part 1: Introduction 

• Part 2: Job Design 

• Part 11: Hardware & Software Usability 

• Part 20: Accessibility and Human-System Interaction 
(AWI: 2004) 

Software 

• Part 100: Introduction to Software Ergonomics 

• Part 110: Dialogue Principles (DIS: 2004, 
revision of ISO 9241-10) 

• Part 112: Presentation Principles and Recom- 
mendations (part of ISO 9241-12) 

• Part 113: User Guidance (ISO 9241-13, 
reference to ISO/IEC 18019) 

• Part 114: Multimedia Principles (ISO 14915-1) 

• Part 115: Dialogue Navigation (part of ISO 
14915-2, reference to ISO/IEC 18035) 



• Part 120: Software Accessibility (ISO/TS 
16071) 

• Part 130: GUI (graphical user interface) & 
Controls (does not yet exist, reference to ISO/ 
IEC 11581) 

• Part 131: Windowing Interfaces (part of ISO 
9241-12) 

• Part 132: Multimedia Controls (part of ISO 
14915-2) 

• Part 140: Selection and Combination of Dia- 
logue Techniques (part of ISO 9241-1) 

• Part 141: Menu Dialogues (ISO 9241-14) 

• Part 142: Command Dialogues (ISO 9241-15) 

• Part 143: Direct-Manipulation Dialogues (ISO 
9241-16) 

• Part 144: Form-Filling Dialogues (ISO 9241- 
17) 

• Part 145: Natural-Language Dialogues 

• Part 150: Media (ISO 14915-3) 

• Part 160: Web Interfaces (ISO 23973, refer- 
ence to ISO/IEC 18036) 

Process 

• Part 200: Human-System Interaction Processes 

• Part 210: Human-Centred Design (ISO 13407) 

• Part 211: HSL 16982, ISO PAS 18152 

Ergonomic Requirements and Measurement 
Techniques for Electronic Visual Displays 

• Part 301: Introduction (CD: 2004) 

• Part 302: Terminology (CD: 2004) 

• Part 303: Ergonomics Requirements (CD: 2004) 

• Part 304: User-Performance Test Method 
(AWI: 2004) 

• Part 305: Optical Laboratory Test Methods 
(CD: 2004) 

• Part 306: Field Assessment Methods (CD: 
2004) 



Table 4. Sources of standards and further information 

Information                                   URL (uniform resource locator) 

Published ISO standards, and the status       http://www.iso.org/iso/en/Standards_Search.StandardsQueryForm 
of standards under development 

ISO national member bodies                    http://www.iso.ch/addresse/membodies.html 

NSSN, a national resource for global          http://www.nssn.org 
standards 
• Part 307: Analysis and Compliance Test Methods 
(CD: 2004) 

Physical Input Devices 

• Part 400: Ergonomic Principles (CD: 2004) 

• Part 410: Design Criteria for Products (AWI: 
2004) 

• Part 411: Laboratory Test and Evaluation Meth- 
ods 

• Part 420: Ergonomic Selection Procedures 
(AWI: 2004) 

• Part 421: Workplace Test and Evaluation Meth- 
ods 

• Part 500: Workplaces 

• Part 600: Environment 

• Part 700: Special Application Domains 

• Part 710: Control Centre (in seven parts) 

ISO/IEC 25000 Series: Software Product- 
Quality Requirements and Evaluation 
(SQuaRE) 

The ISO/IEC 25000 series of standards will replace 
and extend ISO/IEC 9126, ISO/IEC 14598, and the 
Common Industry Format. 

1. ISO/IEC FCD 25000: Guide to SQuaRE 
(2004) 

• ISO/IEC AWI 25001: Planning and Management 
(ISO/IEC 14598-2) 

• ISO/IEC AWI 25010: Quality Model and 
Guide (ISO/IEC 9126-1) 

2. ISO/IEC CD 25020: Measurement Refer- 
ence Model and Guide (2004) 

3. ISO/IEC CD 25021: Measurement Primi- 
tives (2004) 

• ISO/IEC AWI 25022: Measurement of 
Internal Quality (ISO/IEC 9126-3) 

• ISO/IEC AWI 25023: Measurement of 
External Quality (ISO/IEC 9126-2) 

• ISO/IEC AWI 25024: Measurement of 
Quality in Use (ISO/IEC 9126-4) 

4. ISO/IEC CD 25030: Quality Requirements 
and Guide (2004) 

• ISO/IEC AWI 25040: Quality Evaluation- 
Process Overview & Guide (ISO/IEC 
14598-1) 



• ISO/IEC AWI 25041: Evaluation Modules 
(ISO/IEC 14598-6) 

• ISO/IEC AWI 25042: Process for Devel- 
opers (ISO/IEC 14598-3) 

• ISO/IEC AWI 25043: Process for 
Acquirers (ISO/IEC 14598-4) 

• ISO/IEC AWI 25044: Process for Evalu- 
ators (ISO/IEC 14598-5) 

• ISO/IEC 25051: Quality Requirements and 
Testing Instructions for Commercial 
Off-the-Shelf (COTS) Software (ISO/IEC 
12119) 

• ISO/IEC 250nn: Common Industry For- 
mat (ANSI/NCITS 354) 



CONCLUSION 

The majority of effort in ergonomics standards has 
gone into developing conditional guidelines (Reed et 
al., 1999), following the pioneering work of Smith 
and Mosier (1986). Parts 12 to 17 of ISO 9241 
contain a daunting 82 pages of guidelines. These 
documents provide an authoritative source of refer- 
ence, but designers without usability experience 
have great difficulty applying these types of guide- 
lines (de Souza & Bevan, 1990; Thovtrup & Nielsen, 
1991). Several checklists have been prepared to 
help assess the conformance of software to the main 
principles in ISO 9241 (Gediga, Hamborg, & Düntsch, 
1999; Oppermann & Reiterer, 1997; Prümper, 1999). 

In the United States, there is continuing tension 
between producing national standards that meet the 
needs of the large U.S. market and contributing to 
the development of international standards. Having 
originally participated in the development of ISO 
9241, the HFES decided to put subsequent effort into 
a national version: HFES-100 and HFES-200 (see 
Reed et al., 1999). 

Standards are more widely accepted in Europe 
than in the United States, partly for cultural reasons, 
and partly to achieve harmonisation across Euro- 
pean Union countries. Many international standards 
(including ISO 9241) have been adopted as Euro- 
pean standards. The European Union (2004) 
Supplier’s Directive requires that the technical 
specifications used for public procurement must be in the 
terms of any relevant European standards. Ergonomic 
standards such as ISO 9241 can also be used 
to support adherence to European regulations for the 
health and safety of display screens (Bevan, 1991; 
European Union, 1990; Stewart, 2000a). 

Stewart and Travis (2002) differentiate between 
standards that are formal documents published by 
standards-making bodies and developed through a 
consensus and voting procedure, and those that are 
published guidelines that depend on the credibility of 
their authors. This gives standards authority, but it is 
not clear how many of the standards listed in this 
article are widely used. One weakness of most of 
the HCI standards is that they have been discussed 
around a table rather than being developed in a user- 
centred way, testing prototypes during development. 
The U.S. Common Industry Format is an exception, 
undergoing trials during its evolution outside ISO. 
There are ISO procedures to support this, and ISO 
20282 is being issued initially as a technical specifi- 
cation so that trials can be organised before it is 
confirmed as a standard. This is an approach that 
should be encouraged in the future. 

Another potential weakness of international stan- 
dards is that the development process is slow, and 
the content depends on the voluntary effort of 
appropriate experts. Ad hoc groups can move more 
quickly, and when appropriately funded, can pro- 
duce superior results, as with the U.S. National 
Cancer Institute Web design guidelines (Koyani, 
Bailey, & Nall, 2003), which as a consequence may 
remain more authoritative than the forthcoming ISO 
23973. 

Following the trends in software-engineering stan- 
dards, the greatest benefits may be obtained from 
HCI standards that define the development process 
and the capability to apply that process. ISO 13407 
provides an important foundation (Earthy et al., 
2001), and the usability maturity of an organisation 
can be assessed using ISO TR 18529 or ISO PAS 
18152, following the procedure in the software- 
process assessment standard ISO TR 15504-2 
(Sherwood Jones & Earthy, 2003). 
KEY TERMS 

Context of Use: The users, tasks, equipment 
(hardware, software, and materials), and physical 
and social environments in which a product is used 
(ISO 9241-11). 

Interaction: Bidirectional information exchange 
between users and equipment (IEC 61997). 

Prototype: Representation of all or part of a 
product or system that, although limited in some way, 
can be used for evaluation (ISO 13407). 

Task: The activities required to achieve a goal 
(ISO 9241-11). 

Usability: The extent to which a product can be 
used by specified users to achieve specified goals 
with effectiveness, efficiency, and satisfaction in a 
specified context of use (ISO 9241-11). 

User: Individual interacting with the system 
(ISO 9241-10). 

User Interface: The control and information- 
giving elements of a product, and the sequence of 
interactions that enable the user to use it for its 
intended purpose (ISO DIS 20282-1). 
INTRODUCTION 

As individuals launch themselves into cyberspace 
via networked technologies, they must navigate 
more than just the human-computer interface. The 
rhetoric of the “global village” — a utopian vision of 
a harmonious multicultural virtual world — has tended 
to overlook the messier and potentially much more 
problematic social interfaces of cyberspace: the 
interface of the individual with cyberculture 
(Macfadyen, 2004), and the interface of culture with 
culture. To date, intercultural communications re- 
search has focused primarily on instances of physi- 
cal (face-to-face) encounters between cultural 
groups, for example, in the classroom or in the 
workplace. However, virtual environments are in- 
creasingly common sites of encounter and commu- 
nication for individuals and groups from multiple 
cultural backgrounds. This underscores the need for 
a better understanding of Internet-mediated inter- 
cultural communication. 



BACKGROUND 

Researchers from multiple disciplines (cultural stud- 
ies, intercultural studies, linguistics, sociology, edu- 
cation, human-computer interaction, distance learn- 
ing, learning technologies, philosophy, and others) 
have initiated studies to examine virtual intercultural 
communication. The interdisciplinarity of the field, 
however, offers distinct challenges: in addition to 
embracing different definitions of culture, investiga- 
tors lack a common literature or vocabulary. Com- 
municative encounters between groups and indi- 
viduals from different cultures are variously de- 
scribed as cross-cultural, intercultural, multicultural, 
or even transcultural. Researchers use terms such 
as the Internet, the World Wide Web, cyberspace, 
and virtual (learning) environments (VLEs) to denote 
overlapping though slightly different perspectives 
on the world of networked digital communications. 
Others focus on CMC (computer-mediated 
communication), ICTs (Internet and communication 
technologies), HCI (human-computer interaction), 
CHI (computer-human interaction), or CSCW 
(computer-supported cooperative work) in explorations 
of technologies at the communicative interface. 

This article offers an overview of existing theo- 
retical and empirical approaches to examining what 
happens when culturally diverse individuals commu- 
nicate with each other on the Internet: the publicly 
available, internationally interconnected system of 
computers (and the information and services they 
provide to their users) that uses the TCP/IP (trans- 
mission-control protocol/Internet protocol) suite of 
packet-switching communications protocols. 

INVESTIGATING ONLINE 
INTERCULTURAL COMMUNICATION 

Does Culture Influence Internet- 
Mediated Intercultural Communication? 

What does current research tell us about the interplay 
between individuals, cultures, and communication 
online? A significant number of studies have 
begun to explore online intercultural communica- 
tions between and within selected populations. Some 
have employed quantitative methods to investigate 
whether there are specific cultural differences in 
attitudes to technology and the use of technologies, 
in communication patterns and frequency, and in 
communication style or content (for detailed refer- 
ences to these quantitative studies, see Macfadyen, 
Roche, & Doff, 2004). Others (and especially those 
using qualitative approaches) focus less on the technology 
and instead seek evidence of cultural influences 
on interpersonal or intragroup processes, dynamics, 
and communications in cyberspace. For 
example, Chase, Macfadyen, Reeder, and Roche 
(2002) describe nine thematic clusters of apparent 
cultural mismatches that occurred in communica- 
tions between culturally diverse individuals in a 
Web-based discussion forum: differences in the 
choices of participation format and frequency, dif- 
ferences in response to the forum culture, different 
levels of comfort with disembodied communication, 
differing levels of technoliteracy, differences in 
participant expectations, differing patterns of use of 
academic discourse vs. narrative, and differing atti- 
tudes to time and punctuality. To this list of 
discontinuities, Wilson (2001) adds “worldview, cul- 
turally specific vocabulary and concepts, linguistic 
characteristics . . . [and] cognition patterns, including 
reading behaviour” (p. 61). Kim and Bonk (2002) 
report cultural differences in online collaborative 
behaviours, and Rahmati (2000) and Thanasankit 
and Corbitt (2000) describe the different cultural 
values that selected cultural groups refer to in their 
approaches to decision making when working online. 

Evidence is accumulating, then, that seems to 
suggest that cultural factors do impact communica- 
tive encounters in cyberspace. What is the most 
effective framework for exploring and explaining 
this phenomenon, and what role is played by the 
design of human-computer interfaces? 

The Problem of Defining Culture 

Perhaps not surprisingly, most intercultural commu- 
nication researchers have begun by attempting to 
clarify and define what culture is to allow subse- 
quent comparative analyses and examinations of 
cultural differences in communication practices. 
Given that culture “is one of the two or three most 
complicated words in the English language” (Will- 
iams, 1983, p. 87), this definitional quest is, unfortu- 
nately, beset with difficulty. The word itself is now 
used to represent distinct and important concepts in 
different intellectual disciplines and systems of 
thought, and decades of debate between scholars 
across the disciplines have not yielded a simple or 
uncontested understanding of the concept. 

In reality, a majority of existing research and 
theory papers published to date that examine culture 
and communication in online environments implicitly 
define culture as ethnic or national culture, and 
examine online communication patterns among and 
between members of specific ethnic or linguistic 
groups; only a few attempt to broaden the concept of 
culture. Of these, Heaton (1998b) notes, “organiza- 
tional and professional cultures are also vital ele- 
ments in the mix” (pp. 262-263) and defines culture 
as “a dynamic mix of national/geographic, organiza- 
tional and professional or disciplinary variables” (p. 
263). Others highlight the importance of gender 
culture differences in online communications, or 
note the complicating influences of linguistic culture 
and linguistic ability, epistemological type, technical 
skill, literacy (Goodfellow, 2004), class, religion, and 
age (for detailed references, see Macfadyen et al., 
2004). 

The Problem of Essentialism 

Even more problematic than the simplistic equating 
of culture with ethnicity is the persistent and uncriti- 
cal application of essentialist theories of culture and 
cultural difference in intercultural communications 
research. These theories tend to characterize cul- 
ture as an invariant and uncontested matrix of 
meanings and practices that are inherited by and 
shared within a group. They are commonly used 
either to develop testable hypotheses about the 
impact of culture on Internet-mediated intercultural 
communications, or to interpret data post hoc (or 
both). In particular, an increasing number of studies 
rely unquestioningly upon Hofstede’s (1980, 1991) 
dimensions of (national) culture (Abdat & Pervan, 
2000; Gunawardena, Nolla, Wilson, Lopez-Islas, 
Ramirez-Angel, & Megchun-Alpizar, 2001; Maitland, 
1998; Marcus & Gould, 2000; Tully, 1998) even 
though serious questions have been raised about 
Hofstede’s methodological assumptions that might 
make his subsequent conclusions less reliable 
(McSweeney, 2002). Also referenced frequently are 
Edward Hall’s theory (1966) of high- and low-context 
communications (Buragga, 2002; Heaton, 1998a; 
Maitland) and the nationally delineated cultural mod- 
els of Hampden-Turner and Trompenaars (2000). 

Some researchers (Abdelnour-Nocera, 2002; 
Hewling, 2004; Reeder, Macfadyen, Roche, & Chase, 
2004) are now offering critiques of the use of 
essentialist cultural theories in intercultural studies. 
Abdelnour-Nocera discusses, for example, the risks 
of using “ready made cultural models” such as 
Hofstede’s, arguing that one may miss “qualitative 
specific dimensions that don’t fit certain pre-estab- 
lished parameters” (p. 516). The uncritical use of 
essentialist theories of culture carries with it addi- 
tional and more fundamental problems. First, such 
theories tend to forget that cultures change, and 
instead imagine cultures as static, predictable, and 
unchanging. Second, the assumption of cultures as 
closed systems of meaning tends to ignore important 
questions of power and authority: how has one sys- 
tem of meaning, or discourse, come to dominate? 
Related to this, essentialist theories can sometimes 
be criticized as positioning individuals as simple 
enculturated players, lacking in agency, and do not 
allow for the possibility of choice, learning, and 
adaptation in new contexts. Street (1993) reminds us 
that “culture is not a thing” but that it is often “dressed 
up in social scientific discourse in order to be de- 
fined” (p. 25). Culture is, rather, an active process of 
meaning making: a verb rather than a noun. 

Social Construction and Negotiation of 
Meaning 

Asad (1980) argues that it is the production of 
essential meanings — in other words, the production 
of culture — in a given society that is the problem to 
be explained, not the definition of culture itself. In line 
with this, a number of recent studies have attempted to 
examine the negotiation of meaning and the pro- 
cesses of meaning making employed by different 
individuals or groups in cyberspace communications, 
and make use of less- (or non-) essentialist intercul- 
tural and/or communications theory in their research 
design and analysis. Reeder et al. (2004), for ex- 
ample, prefer a Vygotskyan social-constructivist 
stance in which the construction of identity is the 
starting point in their investigation of online intercul- 
tural communication. They interpret intercultural 
patterns of online communication in the light of cross- 
disciplinary theories from sociolinguistics, applied 
linguistics, genre and literacy theory, and aboriginal 
education. Belz (2003) brings a Hallidayan (1978, 
1994) linguistic approach (appraisal theory) to her 
evaluation of intercultural e-mail communications 
without making a priori predications based on the 
communicators’ nationalities. In his analysis of an 
intercultural e-mail exchange, O’Dowd (2003) builds 
upon Byram’s (1997) notion of intercultural competence, 
another theoretical perspective that focuses 
on the negotiation of a communicative mode that is 
satisfactory to all interlocutors. Choi and Danowski 
(2002), meanwhile, base their research on theories 
of social networks; they discuss their findings with 
reference to core-periphery theories of network 
communication and to the online power-play nego- 
tiations of dominant and minority cultures. Alterna- 
tively, Gunawardena, Walsh, Reddinger, Gregory, 
Lake, and Davies (2002) explore the negotiation of 
the face online, building on face theory developed 
by theorists such as Ting-Toomey (1988). Yetim 
(2001) suggests that more attention must be paid to 
the importance of metacommunication as the site of 
clarification of meaning. Thorne (2003) offers an- 
other conceptual framework that draws on an as- 
sessment of “discursive orientation, communicative 
modality, communicative activity and emergent in- 
terpersonal dynamics” (p. 38) in the analysis of 
intercultural engagement online. 

A few authors have recently developed new 
theoretical perspectives on online intercultural com- 
munication. Benson and Standing (2000) propose a 
systems theory of culture that emphasizes culture 
as a self-referential system of rules, conventions, 
and shared understandings rather than as a set of 
categories. They go on to explain perspectives on 
technology and communication as emergent prop- 
erties of culture systems that express core attitudes 
relating to various social contexts. Postmodern 
theorists such as Poster (2001) argue that 
cyberspace requires a new and different social and 
cultural theory that takes into account the specific 
qualities of cyberspace. In cyberspace, he argues, 
individuals and groups are unable to position and 
identify their ethnicity via historically established 
relations to physical phenomena such as the body, 
land, physical environment, and social-political struc- 
tures. Instead, cultural identities in cyberspace, 
constructed in new and fluid ways, are a temporary 
link to a rapidly evolving and creative process of 
cultural construction. 

Understanding Cultural Influences by 
Examining the Interface 

Not surprisingly, this intensive focus on defining the 
culture of online communicators has tended to 
distract attention from the original question: what is 
happening at the intercultural interface? Indeed, 
Thornton (1988) has argued, “Part of the problem 
that besets our current efforts to understand culture 
is the desire to define it” (p. 26). Abdelnour-Nocera 
(2002) has proposed a number of theoretical per- 
spectives that he believes may be more effective for 
carefully examining the role the computer interface 
plays as a culturally meaningful tool in intercultural 
interaction. A situated-action perspective, already 
commonly referenced in the HCI literature and 
based on Suchman’s work (1987), may, he suggests, 
place more useful emphasis on the unique context 
and circumstances of intercultural communication 
events online because it emphasizes the interrela- 
tionship between an action (here, communication) 
and its context of performance. Context here is not 
simply a vague backdrop to communication; rather, 
it is a complex constructed by users as they make 
selective sense of their environment of interaction 
and of each other based on their goals and resources 
(and skills). 

Alternatively, what Abdelnour-Nocera (2002) 
calls the semiotic approach focuses on technological 
aspects of the online communicative interface that 
are subject to cultural interpretation (here, icons, 
headings, text, and pictures). The context of use and 
user culture are considered, but are understood to be 
the source of meaning-making, interpretive strate- 
gies that must be matched in efforts to construct 
meaningful technological interfaces for communica- 
tion. 

Also emphasizing the importance of situation and 
context, Bucher (2002) proposes that a more mean- 
ingful approach to understanding the relationship 
between Internet communication and culture must 
examine the role of an interactive audience, and 
especially their communicative and intercultural com- 
petence (although he does not define the latter). 
Bucher also explores the phenomenon of trust devel- 
opment in cyberspace communications as a key 
feature of the online context. The disembodied 
nature of communication in cyberspace, says Bucher, 
means a loss of control — of time, of space, of 
content, of communicators — and a sensation of in- 
formational risk that can only be overcome through 
trust. Trust is, however, at the disposal of the 
audience or listener, not the speaker. 



Implications for Interface Design 

Operationalizing our fragmentary understanding of 
computer-mediated intercultural communication processes 
is a challenge. Neat predictive formulas for 
cultural behaviours in online environments are at- 
tractive to managers and corporations because they 
seem to facilitate the easy modification of platforms, 
interfaces, and environments for different catego- 
ries of users. And indeed, the technology-interna- 
tionalization and -localization discourse continues to 
be dominated by the Hofstede model (Abdelnour- 
Nocera, 2002). Examples of this localization ap- 
proach include Abdat and Pervan’s (2000) recom- 
mendations for design elements that minimize the 
communicative challenges of high-power distance in 
Indonesian groups, Heaton’s (1998b) commentary 
on technology and design preferences of Japanese 
and Scandinavian users, Onibere, Morgan, Busang, 
and Mpoeleng’s (2001) unsuccessful attempt to 
identify localized interface design elements more 
attractive to Botswana users, Turk and Trees’ (1998) 
methodology for designing culturally appropriate 
communication technologies for indigenous Austra- 
lian populations, and Evers’ (1998) portfolio of cul- 
turally appropriate metaphors for human-computer 
interface design. 

Such design approaches, founded on essentialist 
classification theories of culture, though attractive, 
remain problematic because their foundational theo- 
ries are problematic, as discussed. They do not offer 
any practical assistance to designers constructing 
online environments for culturally heterogeneous 
groups of cyberspace communicators, although such 
groups are rapidly becoming the cyberspace norm. 

FUTURE TRENDS 

Unfortunately, intercultural theories that highlight 
cultural fluidity, the role of context, and the intercultural 
negotiation of meaning are much more difficult 
to incorporate into approaches to interface design. 
Nevertheless, a few groups have initiated projects 
intended to make human-computer interfaces more 
culturally inclusive (rather than culturally specific). 
Foremost are Bourges-Waldegg and Scrivener (1998,
2000), who have developed an approach they call
“meaning in mediated action” (MIMA), which builds
on the semiotic perspective discussed by Abdelnour- 
Nocera (2002). Rather than attempting to design for 
individual cultures, this approach hopes to help de- 
signers understand context, representations, and 
meaning, and allow them to design interfaces that 
are more generally accessible. 

One of the few proposed design strategies that 
makes use of situated-action perspectives (see above) 
and tries to accommodate the great variability in user 
context, changing user requirements, and fast-paced 
technology evolution is a scenario-based design 
approach proposed by Carroll (2000). Although this 
design approach does not address Internet-mediated 
intercultural communication issues explicitly, Carroll 
explains that it is a methodological tradition that 
“seeks to exploit the complexity and fluidity of 
design by trying to learn more about the structure 
and dynamics of the problem domain” rather than 
trying to control this complexity (p. 44). 

As the design and evaluation of interfaces and 
environments for intercultural communication con- 
tinue, it will also be important to explore whether 
different communicative technologies may actually 
constitute different kinds of cyberculture or mediate 
different kinds of intercultural exchange. In current 
literature, studies examining intercultural e-mail ex- 
change predominate, although conclusions and im- 
plications are often extrapolated to all Internet and 
communication technologies. A smaller number of 
studies investigates intercultural communication in 
asynchronous forums and discussion boards, in group- 
conferencing platforms, in newsgroups, and via syn- 
chronous communications technologies. Even fewer 
discuss cultural implications for other human-internet 
interfaces such as Web sites and graphics. As yet, 
little or no analysis exists of intercultural communi- 
cation via current cutting-edge communication plat- 
forms such as Weblogs and wikis (for detailed 
references, see Macfadyen et al., 2004). 

CONCLUSION 

Ironically enough, for an endeavour dedicated to 
exploring the lived reality of cultural diversity and 
dynamic change, a focus on defining culture itself 
may actually be inhibiting our ability to examine and 
understand the real processes of intercultural exchange
that occur in the virtual world of cyberspace.
Instead, we may need to look beyond classification 
systems for individual communicators to the pro- 
cesses that occur at their interface; if we are lucky, 
more information about cultural identity may then 
become visible at the edge of our vision. Hewling 
(2004) makes use of the well-known optical-illusion 
image of two mirrored faces in profile — which can 
also be seen as a central goblet — to argue for 
another way of seeing intercultural encounters in 
cyberspace. She suggests that the use of Hofstedean 
ideas of culture can result in a focus on only the 
faces in the picture, while the more critical field of 
exploration is the mysterious space in between. Is it 
a goblet? Or, is it another kind of collectively shaped 
space? Bringing Street’s (1993) ideas to the worlds 
of cyberspace, Raybourn, Kings, and Davies (2003) 
have suggested that intercultural interaction online 
involves the construction of a third culture: a pro- 
cess, not an entity in itself. While this culture (or 
multiplicity of situational cultures) may be influ- 
enced by cultures that communicators bring to each 
exchange, more insight may be gained by investigat- 
ing the evolving processes and tools that individuals 
invoke and employ to negotiate and represent personal
and group identity, and collective (communicative)
construction of meaning. What roles do language,
literacy, and linguistic competence play?
Does this creative process occur differently in text- 
only and media-rich online environments? Answers 
to such questions will surely be relevant to HCI 
practitioners because they will illuminate the need to 
stop viewing human-computer interfaces as tools for 
information exchange and will expose the persistent 
shortcomings of design approaches that attempt to 
match contextual features with supposedly static 
cultural preferences. These interfaces will more 
importantly be viewed as the supporting framework 
that may foster or hinder the creative communica- 
tion processes in cyberspace that are the foundation 
of successful intercultural communication.

KEY TERMS

Culture: Multiple definitions exist, including essentialist
models that focus on shared patterns of learned values,
beliefs, and behaviours, and social-constructivist views
that emphasize culture as a
shared system of problem solving or collective mean- 
ing making. The key to the understanding of online 
cultures — for which communication is as yet domi- 
nated by text — may be definitions of culture that 
emphasize the intimate and reciprocal relationship 
between culture and language. 

Cyberculture: As a social space in which hu- 
man beings interact and communicate, cyberspace 
can be assumed to possess an evolving culture or set 
of cultures (cybercultures) that may encompass 
beliefs, practices, attitudes, modes of thought, 
behaviours, and values. 

Cyberspace: While the Internet refers more 
explicitly to the technological infrastructure of net- 
worked computers that make worldwide digital com- 
munications possible, cyberspace is understood as 
the virtual places in which human beings can com- 
municate with each other, and that are made pos- 
sible by Internet technologies. Levy (2001) charac- 
terizes cyberspace as “not only the material infra- 
structure of digital communications but... the oce- 
anic universe of information it holds, as well as the 
human beings who navigate and nourish that infra- 
structure.” 

Essentialism: The view that some properties 
are necessary properties of the object to which they 
belong. In the context of this article, essentialism 
implies a belief that an individual’s cultural identity
(nationality, ethnicity, race, class, etc.) determines 
and predicts that individual’s values, communicative 
preferences, and behaviours. 

Intercultural: In contrast to multicultural (which 
simply describes the heterogeneous cultural identi- 
ties of a group), cross-cultural (which implies some 
kind of opposition), or transcultural (which has been 
used to suggest a cultural transition), intercultural is 
used to describe the creative interactive interface 
that is constructed and shared by communicating 
individuals from different cultural backgrounds. 

Technoliteracy: A shorthand term referring to 
one’s competence level, skill, and comfort with 
technology.

INTRODUCTION

The goal of any educational method is to enable the pupil
to acquire knowledge. The way in which this aim is
pursued, however, gives rise to four different families
of methods, distinguished by two criteria: (1) who
leads the educational process and (2) whether the pupil’s
physical attendance is required. Regarding the former
criterion, the process may be conducted either by the
teacher (Teaching-Oriented Process) or by the
pupil (Learning-Oriented Process). Both
processes have the same aim: the internalization and
comprehension of knowledge by the pupil. The
difference between them lies in the distinctive
procedure followed in each case to achieve the
common goal. Regarding the second criterion, the
methods may or may not require pupil attendance.

Bearing in mind this classification, four different 
types of educational methods could be described: 

1. Teaching Method: This includes the already 
known classic educational methods, the Con- 
ductivity Theory (Good & Brophy, 1990) being 
the foremost one. This method is characterized 
by the fact that the teacher has the heavier role 
during education — the transmission of knowl- 
edge. 



2. E-Teaching Method: This second type comes 
from the expansion and popularity of communi- 
cation networks, especially the Internet. This 
method brings the teacher to the physical loca- 
tion of the pupil; one of its most important 
representative elements is the videoconference. 

3. Learning Method: This constitutes a new 
vision of the educational process, since the 
teacher acts as a guide and reinforcement for 
the pupil. The educational process has the 
heavier role in this method. In other words, the 
teacher creates a need for learning and afterwards
provides the pupil with the necessary
means to satisfy that need.
Piaget’s Constructivist Theory is one of the
most notable examples of this approach (Piaget, 1972,
1998).

4. E-Learning Method: This method is sup- 
ported both by learning methods and by the 
expansion of communication networks in order 
to facilitate access to education with no physi- 
cal or temporal dependence from pupil or 
teacher. As in learning methods, the pupil, not 
the teacher, is the one who sets the learning 
rhythm.

Table 1. Functional summary of main e-learning applications

                        Moodle  Ilias  ATutor  WebCT        BlackBoard  QSTutor
Course Manager          ✓       ✓      ✓       ✓            ✓           ✓
Content Manager         ✓       ✓      ✓       ✓            ✓           ✓
Complementary Readings  ✓       -      -       [illegible]  ✓           -
FAQs                    -       -      -       ✓            ✓           ✓
Notebook                ✓       ✓      ✓       -            ✓           -
Search                  -       ✓      ✓       -            -           -
Chat                    ✓       ✓      ✓       ✓            ✓           ✓
Videoconference         -       -      -       -            -           ✓
Forum                   ✓       ✓      ✓       ✓            ✓           ✓
E-mail                  [values not recoverable]

Each of these types of educational methods may
be suitable for a given context, the e-learning sys- 
tems being the preferred ones in the following 
circumstances: 

1. When looking for a no-attendance-required 
educational method. 

2. When the pupil, not the teacher, wants to set 
the educational rhythm. This choice might be
based on several reasons, ranging from the
need to adapt to the pupil’s availability
(i.e., to achieve temporal independence) to the
view that learning is a more appropriate
approach than teaching in a
particular application context (Pedreira, 2003).

3. When the knowledge to be transmitted is to be 
accessible to a large number of pupils. In teaching
methods, the teacher is the one who transmits
knowledge and supervises the pupils; therefore,
the quality of the education is influenced
by the number of pupils. In e-learning, by contrast,
the core of the educational process is
the relationship between pupil and didactic
material, with the teacher acting as a consultant.
In this way, a teacher can attend
to a larger number of pupils without
harming the quality of the education.

This article focuses both on the study of e-learning
systems and on the application procedure
for this new discipline. The Background section
briefly discusses currently used e-learning
systems and their points of view. The Main Focus of
the Article section suggests a new focus for this type
of system in an attempt to remedy some shortcomings
detected in existing systems. The Future
Trends section introduces some guidelines that may
steer the evolution of this discipline in the future.
Finally, the Conclusion section presents the conclusions
drawn.



BACKGROUND 

Nowadays, there are numerous applications that
describe themselves as e-learning tools or systems. Table 1
shows the results of a study of the main
applications identified, such as Moodle (http://moodle.org),
Ilias (http://www.ilias.uni-koeln.de/ios/index-e.html),
ATutor (http://www.atutor.ca/atutor), WebCT
(http://www.webct.com), BlackBoard
(http://www.blackboard.net), and QSTutor
(http://www.qsmedia.es). Each of these applications has been
analyzed from the point of view of the functionality
it supports. As the table shows,
these applications are based mainly on document
management and provide a wide range of
communication possibilities (especially forum and
chat) and agendas.

Nevertheless, and despite the growing number
of e-learning applications, the approach taken by
this discipline is currently under discussion. This is
because, despite the important conceptual
differences between e-learning and classical teaching
methods, the developers of such applications
usually operate with the same frame of mind as
with classical methods; that is, an editorial mindset.
In other words, it is common for
an e-learning application to be reduced to
a simple digitization and distribution of the same
contents used in classical teaching (Martinez, 2003).
In this scenario, pupils read content pages that have
been structured in a way analogous to textbooks
or traditional class notes, using multimedia applications
with self-evaluation exercises to verify
the assimilation of what has previously been read.

Systems developed in this way, which
should be regarded not as e-learning but as
e-reading (Martinez, 2002), are inappropriate, since
technology must not be seen as an end in itself but
as a means of easing access to education. Moreover,
teaching material prepared as
described still requires attendance at an explanatory lesson;
therefore, it is not enough for pupils to self-regulate
their learning. All of this should
prompt a change in the current orientation of this
discipline, with more attention paid to the preparation
and structuring of teaching material.



MAIN FOCUS OF THE ARTICLE

In this situation, the present work aims to remedy
the shortcomings previously identified by
defining the basic structure that any teaching
material should have in order to achieve the goal
of any e-learning application: the appropriate
transmission of knowledge. To obtain this structure,
the selected route has been the knowledge
management (KM) discipline, one of whose main purposes
is to determine knowledge representations
intended for easier assimilation.

The following subsections detail not only the proposed
structure for the development of an e-learning
application but also the defined ontology that should
be used to structure one of its key aspects: the
knowledge base.

Proposed Structure for E-Learning
Systems

Three key elements may be identified in any educational
process: pupil, teacher, and contents/teaching
material. E-learning is no exception; therefore,
any system that supports this discipline
should be structured on the same triangular
basis (Friss, 2003) by defining
modules for the three factors described. All
these modules should be supported by a communication
module.

Pupil Information Module

This module provides pupils and teacher with information
about the former. More specifically, this
information should include for every pupil not only
his or her personal information (especially contact
data) together with an academic and/or professional
profile, but also more dynamic aspects such
as current courses and levels achieved, along with
progress relative to what was expected and a
record of any problems that might have arisen.

Teacher Information Module

This module makes teacher information available
to pupils. In this way, a given pupil should know not
only how to contact a specific teacher but also the
topics in which this teacher could help him or her.



Contents Module 

The different resources available for pupils in order 
to acquire, consolidate, or increase their knowledge 
are contained in this module. The organization
proposed for this module is based on the three basic
pillars that the KM discipline uses to structure
the knowledge to be transmitted: knowledge base,
lessons learned, and yellow pages (Andrade et al.,
2003). These pillars, after slight adaptations, are
perfectly valid for the organization of this module:

1. The submodule named as knowledge base 
constitutes the central nucleus, not only of this 
module but also of e-learning, since it is the 
container for each course-specific content. 
Given its importance, this aspect will be ap- 
proached later on. 

2. The lessons learned (Van Heijst, Van der 
Spek & Kruizinga, 1997) submodule contains 
the experiences of both pupils and teacher
regarding the knowledge base. It is important to
record not only the positive experiences, such as
hints for a better solution, but also the negative
ones, such as frequent mistakes made when
applying the knowledge.

Figure 1. Structure of e-learning systems




3. The yellow pages (Davenport & Prusak, 2000) 
help the pupil to identify the most suitable 
teacher in order to solve a particular question 
as well as to identify the appropriate resources
(books, class notes, etc.) for exploring
a specific topic in depth.

Communication Module 

This module, as shown in Figure 1, gives support to 
the previous ones. Its main task is giving users 
access to the e-learning system. By this means, 
teachers and pupils have access not only to the 
previous modules but also to communication and 
collaboration among the different system users. It is 
important to point out that this communication should 
not be limited to that between a pupil and a teacher, 
but in some domains, it also could be interesting to 
allow pupils and even teachers to intercommunicate. 
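
The module structure described above can be sketched as a small set of data types. This is only an illustrative sketch under stated assumptions, not the authors’ implementation: every class, field, and method name below (PupilInfo, TeacherInfo, ContentsModule, ELearningSystem, register_teacher, who_can_help) is hypothetical, and the communication module is left out because the text does not specify its interface.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PupilInfo:
    """Entry of the pupil information module (hypothetical field names)."""
    name: str
    contact: str
    profile: str = ""  # academic and/or professional profile
    levels_achieved: dict[str, int] = field(default_factory=dict)  # course -> level
    problems: list[str] = field(default_factory=list)  # problems that have arisen


@dataclass
class TeacherInfo:
    """Entry of the teacher information module (hypothetical field names)."""
    name: str
    contact: str
    topics: list[str] = field(default_factory=list)  # topics the teacher can help with


@dataclass
class ContentsModule:
    """Contents module with its three KM-inspired submodules."""
    knowledge_base: dict[str, str] = field(default_factory=dict)  # lesson title -> content
    lessons_learned: list[str] = field(default_factory=list)  # positive and negative experiences
    yellow_pages: dict[str, list[str]] = field(default_factory=dict)  # topic -> teacher names


@dataclass
class ELearningSystem:
    """Triangular structure: pupils, teachers, and contents."""
    pupils: list[PupilInfo] = field(default_factory=list)
    teachers: list[TeacherInfo] = field(default_factory=list)
    contents: ContentsModule = field(default_factory=ContentsModule)

    def register_teacher(self, teacher: TeacherInfo) -> None:
        """Add a teacher and index his or her topics in the yellow pages."""
        self.teachers.append(teacher)
        for topic in teacher.topics:
            self.contents.yellow_pages.setdefault(topic, []).append(teacher.name)

    def who_can_help(self, topic: str) -> list[str]:
        """Yellow-pages lookup: names of teachers able to help with a topic."""
        return self.contents.yellow_pages.get(topic, [])
```

In this sketch the yellow pages are filled in automatically from the teacher information, which is one possible way of keeping the two modules consistent; the article itself does not prescribe how the submodule is populated.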

Ontology for the Definition of 
Knowledge Base 

As previously mentioned, the knowledge base acts 
like a storage space and a source of specific con- 
tents for the pupils to obtain the desired knowledge. 
The type of organization of these contents should 
allow significant learning in which pupils should be 
able to assimilate, conceptualize, and apply the ac- 
quired knowledge to new environments (Ausubel, 
David, Novak, & Hanesian, 1978; Michael, 2001). 



In order to achieve this type of structure, the first 
step is the partition of the course into topics and 
subtopics for the identification of the specific les- 
sons. This division will generate a subject tree that 
represents a useful tool for the pupil, so that he or she 
may understand the global structure of the course. 
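
As a minimal sketch of the subject tree, the partition into topics and subtopics can be modeled as a recursive structure whose leaves are the individual lessons. The class name Topic, its methods, and the sample course in the usage note are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """A node of the subject tree; nodes without subtopics are lessons."""
    title: str
    subtopics: list[Topic] = field(default_factory=list)

    def lessons(self) -> list[str]:
        """Return the titles of all leaf nodes, in course order."""
        if not self.subtopics:
            return [self.title]
        found: list[str] = []
        for sub in self.subtopics:
            found.extend(sub.lessons())
        return found

    def outline(self, depth: int = 0) -> str:
        """Render the tree as an indented outline showing the global structure."""
        lines = ["  " * depth + self.title]
        for sub in self.subtopics:
            lines.append(sub.outline(depth + 1))
        return "\n".join(lines)
```

For a hypothetical course built as `Topic("Arithmetic", [Topic("Fractions", [Topic("Adding fractions")]), Topic("Decimals")])`, calling `lessons()` yields the two leaves in order, and `outline()` gives the pupil an indented view of the whole course.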

The following step should be the description of
the lessons that have been identified. To this end,
it is proposed that every lesson
be preceded by a brief introduction. After that, the
aim is to generate in the pupil a genuine need to learn,
so as to increase receptiveness
to the contents. Once the lesson content has been presented,
the pupil should be guided in the practical
application of the theoretical
knowledge just acquired. As a final stage, progress
can be verified by means of an evaluation
of the acquired knowledge.

Figure 2 shows the proposed ontology for defini- 
tion and implementation of the knowledge base, 
supported by these components. As can be noticed, 
on some occasions, one or more components might 
not be applicable for a specific lesson. The neces- 
sary distinction of the relevant aspects for each case 
should be performed by the teacher. These compo- 
nents are detailed next. 
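
The lesson components and their order can be sketched as a simple record in which any component the teacher judges not applicable is left empty. The names Lesson, COMPONENT_ORDER, and sequence are hypothetical, and plain strings stand in for whatever richer content a real system would store.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Presentation order taken from the text: introduction, generation of the
# need, content, practical application, and evaluation.
COMPONENT_ORDER = ("introduction", "need", "content", "practice", "evaluation")


@dataclass
class Lesson:
    """One lesson of the knowledge base; None marks a component not applicable."""
    title: str
    introduction: Optional[str] = None
    need: Optional[str] = None
    content: Optional[str] = None
    practice: Optional[str] = None
    evaluation: Optional[str] = None

    def sequence(self) -> list[tuple[str, str]]:
        """Return the applicable components in presentation order."""
        steps = []
        for name in COMPONENT_ORDER:
            value = getattr(self, name)
            if value is not None:
                steps.append((name, value))
        return steps
```

Keeping the order in a single tuple means a lesson with omitted components still presents the remaining ones in the sequence the ontology proposes.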

Introduction of the Lesson 

The aim of this component is to give the pupil an
overview of the lesson that is about to start. To
achieve this, not only should the purposes of the lesson
be outlined clearly, but its contents
should also be described briefly. In addition, it
should be made clear that lessons are
interdependent with one another and/or with the rest of
the basic knowledge. It is also important to specify
clearly what prior knowledge is required
for a successful approach.

Figure 2. Ontology for the development and
implementation of a knowledge base

Generation of the Need 

The motivation of the pupil constitutes a key aspect
of learning. A good strategy for achieving this is to
stimulate the pupil’s understanding of the usefulness
of the knowledge that he or she is going to acquire
(Wilkerson & Gijselaers, 1996). This purpose might
be accomplished with need-generating exercises, which
consist of posing a problem whose method of
solution is not known by the learner.

Content Developing

This is the component where the knowledge included
in the lesson is maintained and transmitted.
That knowledge may be dynamic or static; the
distinction between the two constitutes the so-called
functional taxonomy of knowledge
(Andrade et al., 2004a):

1. Dynamic Knowledge: Knowledge related to
the behavior that exists in the domain; that is,
functionality, action, processes, and control.
This level can be divided into two sublevels:

a. Strategic: Specifies what to do and where
as well as when, in what order, and why to
do it. This knowledge handles the functional
decomposition of each operation into its constituent
steps as well as the order in which
they have to be undertaken.

b. Tactical: Specifies how to do the tasks and
under what circumstances they have to be
done. This type of knowledge is associated
with the execution process for each strategic
step of the preceding sublevel.

2. Static Knowledge: Comprises the structural
or declarative domain knowledge and specifies
the elements (concepts, relations, and properties)
that are handled when carrying out the
tasks (i.e., handled by tactical knowledge) and
the elements that are the basis for the decisions
(i.e., implied in the decisions of strategic knowledge).



Therefore, a lesson should give support to the 
types of knowledge that have been identified. With 
this intention and given the different characteristics 
of each of them, they should be described on the 
basis of different parameters. In this way, Table 2 
shows in schematic fashion those aspects that should 
be kept in mind when describing a knowledge asset, 
depending on the type of considered knowledge. 
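
One way to make these type-dependent parameters operational is a checklist that reports which descriptors a knowledge asset still lacks. The sketch below paraphrases the aspects listed in Table 2; the identifiers used as dictionary keys and the function name missing_descriptors are hypothetical shorthand, not terms from the taxonomy itself.

```python
from __future__ import annotations

# Descriptors that each kind of knowledge asset should carry, paraphrased
# from Table 2. All identifiers are illustrative shorthand.
REQUIRED_DESCRIPTORS = {
    "strategic": {"decomposition", "execution_order", "pre_post_conditions",
                  "inputs_outputs", "responsible"},
    "tactical": {"operation_mode", "limits", "elements"},
    "concept": {"properties", "relations"},
    "relation": {"participants", "limits"},
    "property": {"value_type", "possible_values", "source", "limits"},
}


def missing_descriptors(kind: str, asset: dict) -> list[str]:
    """List, in alphabetical order, the descriptors the asset still lacks."""
    required = REQUIRED_DESCRIPTORS[kind]
    return sorted(required - set(asset))
```

A teacher-facing authoring tool could, for instance, call `missing_descriptors("tactical", {"operation_mode": "algorithm"})` to learn that the limiting aspects and the handled elements have not yet been described.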

Table 2. Aspects to be considered when describing a type of knowledge

Level                     Characteristics to Consider
Strategic                 Functional decomposition of each operation into its constituent substeps.
                          Execution order of the identified steps for fulfilling the operation.
                          Pre-conditions and post-conditions for the execution of every identified step.
                          Inputs and outputs of each identified step.
                          Party responsible for each step.
Tactical                  Operation mode (algorithm, mathematical expression, or inference) of each identified step.
                          Limiting aspects in its application.
                          Elements (concepts, relations, and properties) that it handles and produces.
Declarative  Concepts     Relevant properties.
                          Relations in which it participates.
             Relations    Properties, concepts, and/or relations participating in the relation.
                          Limiting aspects in its application.
             Properties   Type of value.
                          Possible values that it can take.
                          Source that provides its value(s).
                          Limiting aspects in its application.

This taxonomy has been used by the authors not
only for conceptualization and modeling of problems
(Andrade et al., 2004b) but also for KM systems
definition and implementation (Andrade et al., 2003).
As a result, it has been concluded that organizing
knowledge on the basis of this taxonomy facilitates
its transmission, understanding, and assimilation.
This statement is supported by the fact that
the functional taxonomy is consonant with the human
mindset and therefore comes naturally to people
(Andrade et al., 2004a).

Practical Application 

Once the pupil has acquired the required knowledge
assets, he or she should put them into practice for
proper internalization and assimilation. This task will
be performed in two phases: consolidation and training.

The consolidation phase is intended to reinforce
theory by means of practice. With this aim, a set
of examples will be made available to the pupil, who
will be guided through their resolution by means
of the justification of every decision made.

During the training phase, the aim is for the pupil
to apply the resolution method directly to
a set of increasingly complex exercises.

It should be highlighted that the more closely exercises
resemble real circumstances, the better the results
obtained; consequently, exercises
and examples should be as realistic as possible.

Evaluation 

The evaluation of the acquired knowledge will pro- 
vide useful information for both pupils and teachers 
who would be able to verify a pupil’s evolution 
regarding what was expected and whether pre- 
established levels have been accomplished or not. 
Likewise, the evaluation would allow the detection of
any learning problem in order to handle it appropri- 
ately. 
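
The comparison between achieved and pre-established levels can be sketched as follows. The function name, the representation of levels as numbers per lesson, and the decision to treat a lesson that has not yet been evaluated as a score of zero are all assumptions made for this example.

```python
from __future__ import annotations


def detect_learning_problems(expected: dict[str, float],
                             achieved: dict[str, float]) -> list[str]:
    """Return the lessons whose achieved level is below the pre-established one.

    Lessons missing from `achieved` are treated as level 0.0 (an assumption),
    so they are reported whenever a positive level was expected.
    """
    return [lesson for lesson, level in expected.items()
            if achieved.get(lesson, 0.0) < level]
```

Both pupil and teacher could consult such a list: the pupil to see where more practice is needed, and the teacher to decide where to intervene.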

FUTURE TRENDS 

As previously mentioned in this article, the e-learning
discipline has so far followed an inappropriate
approach: the mere digitization and publication of
the same teaching material commonly used in
face-to-face lessons, making technology not a tool but a
goal in itself. Awareness of this situation should
prompt a change in the way e-learning systems
are developed, fitting the definition
and structuring of the teaching material to the needs
that human beings might present. In other words, as
human-computer interaction (HCI) emphasizes, human
beings should not be servants of but served by
technology.

Shown here is a first attempt toward the attainment
of this objective. A lot of work remains to be
done; therefore, it is expected that future investigations
will focus on a more exhaustive definition
of the way in which teaching material should
be prepared in order to be more easily assimilated
by pupils.

CONCLUSION 

The present work has catalogued the different existing
educational methods according both to who
leads the process and to whether physical attendance
of the pupil is required. This classification has
allowed, for every method and especially for e-learning,
the identification of its inherent particulars.

Nonetheless, most of the so-called e-learning
systems do not properly support the intrinsic characteristics
of this type of system, since they merely
provide an electronic format for the teaching material
of classical teaching.

KM techniques have been used with the aim of
providing an answer to this situation. This discipline
tries to find the optimal strategies for the representation
and transmission of knowledge so that its
subsequent comprehension might be facilitated. Following
this, a basic structure for e-learning systems was
defined using modules and submodules. Similarly,
after the identification of the knowledge base as one
of the key aspects of this structure, a
specific ontology was described for its definition and
implementation.

Finally, it should be mentioned that
self-contained and self-explanatory teaching material,
which eases the acquisition of knowledge, can be
obtained by using the defined ontology.

KEY TERMS

Education: Process of forming or instructing
an individual by means of the internalization and
assimilation of new knowledge assets and capabilities.

E-Learning: Discipline that applies current in- 
formation and communications technologies to the 
educational field. This discipline tries to facilitate the 
learning process, since its methods do not depend on 
physical location or timing circumstances of the 
pupil. 

Knowledge: Pragmatic level of information that 
provides the capability of dealing with a problem or 
making a decision. 

Knowledge Management: Discipline that in- 
tends to provide, at its most suitable level, the 
accurate information and knowledge for the right 
people whenever they may need it and at their best 
convenience. 

Learning: Educational process of self-education
or self-instruction through study or experience.

Significative Learning: Type of learning in
which contents are related in a substantial and
non-arbitrary fashion to what the pupil already knows.

Teaching: Educational process wherein a
teacher, through the transmission of knowledge, educates
or instructs someone.

INTRODUCTION

Virtual environments provide a computer-synthe- 
sized world in which users can interact with objects, 
perform various activities, and navigate the environ- 
ment as if they were in the real world (Sherman & 
Craig, 2002). Research in a variety of fields (e.g.,
software engineering, artificial intelligence, computer
graphics, human-computer interaction, electrical
engineering, psychology, perceptual science)
has been critical to the advancement of the design 
and implementation of virtual environments. Appli- 
cations for virtual environments are found in various 
domains, including medicine, engineering, oil explo- 
ration, and the military (Burdea & Coiffet, 2003). 

Despite the advances, navigation in virtual envi- 
ronments remains problematic for users (Darken & 
Sibert, 1996). Users of virtual environments, without 
any navigational tools, often become disoriented and 
have extreme difficulty completing navigational tasks 
(Conroy, 2001; Darken & Sibert, 1996; Dijk et al., 
2003; Modjeska & Waterworth, 2000). Even simple 
navigational tools are not enough to prevent users 
from becoming lost in virtual environments. Natu- 
rally, this leads to a sense of frustration on the part 
of users and decreases the quality of human-com- 
puter interactions. In order to enhance the experi- 
ence of users of virtual environments and to over- 
come the problem of disorientation, new sophisti- 
cated tools are necessary to provide navigational 
assistance. We propose the design and use of navi- 
gational assistance systems that use models derived 
through data mining to provide assistance to users. 
Such systems formalize the experience of previous
users and make it available to new users in order
to improve the quality of new users’ interactions
with the virtual environment.



BACKGROUND 

Before explaining any navigational tool design, it is 
important to understand some basic definitions about 
navigation. Wayfinding is the cognitive element of 
navigation. It is the strategic element that guides 
movement and deals with developing and using a 
cognitive map. Motion, or travel, is the motoric 
element of navigation. Navigation consists of both 
wayfinding and motion (Conroy, 2001; Darken & 
Peterson, 2002). 

Wayfinding performance is improved by the ac- 
cumulation of different types of spatial knowledge. 
Spatial knowledge is based on three levels of infor- 
mation: landmark knowledge, procedural knowledge, 
and survey knowledge (Darken & Sibert, 1996; 
Elvins et al., 2001). Before defining landmark knowledge,
it is important to understand that a landmark
refers to a distinctive and memorable object with a 
specific shape, size, color, and location. Landmark 
knowledge refers to information about the visual 
features of a landmark, such as shape, size, and 
texture. Procedural knowledge, also known as route 
knowledge, is encoded as the sequence of naviga- 
tional actions required to follow a particular route to 
a destination. Landmarks play an important role in 
procedural knowledge. They mark decision points 
along a route and help a traveler recall the proce- 
dures required to get to a destination (Steck & 
Mallot, 2000; Vinson, 1999).

A bird’s eye view of a region is referred to as
survey knowledge. This type of knowledge contains 
spatial information about the location, orientation, 
and size of regional features. However, object loca- 
tion and interobject distances are encoded in terms 
of a geocentric (i.e., global) frame of reference as 
opposed to an egocentric (i.e., first-person) frame of 
reference. Landmarks also play a role in survey 
knowledge. They provide regional anchors with 
which to calibrate distances and directions (Darken 
& Sibert, 1996; Elvins et al., 2001). 

The quality of spatial knowledge that a user has 
about a virtual environment determines his or her 
performance on a wayfinding task. Any navigational
assistance provided by a tool is intended to help the
user gain spatial knowledge about the environment.
Therefore, a key element to the success of
any navigational tool is how effective it is in repre- 
senting and providing spatial knowledge that is easy 
to understand and useful from the perspective of the 
user. 

In the past, different navigational tools and tech- 
niques to improve wayfinding have been included in 
the design of virtual environments. Maps and grids 
have been introduced to bring legibility to virtual 
environments and to improve wayfinding perfor- 
mance (Darken & Sibert, 1996). Personal agents 
have been used that can interact with the user and 
provide verbal navigational assistance (Dijk et al., 
2003). Due to their popularity, there also has been 
tremendous focus on the use and design of landmarks
to aid in wayfinding (Elvins et al., 2001; Steck
& Mallot, 2000; Vinson, 1999). The achievement of
previous researchers has been significant, but the 
area of navigation in virtual environments still re- 
mains an open research topic. 

WAYFINDING: THE DATA-MINING
APPROACH

Data mining is the process of discovering previously 
unknown patterns, rules, and relationships from data 
(Han & Kamber, 2001). A Knowledgeable Naviga- 
tional Assistance System (KNAS) is a tool that 
employs models derived from mining the naviga- 
tional records of previous users in order to aid other 
users in successfully completing navigational tasks. 
For example, the navigational records of previous
users may be mined to form models about common
navigational mistakes made by previous users. A 
KNAS could be designed to use these models to help 
users avoid backtracking and making loops. Another 
example would be to derive models of frequent 
routes taken by previous users. These frequent 
routes may defy traditional criteria for route selec- 
tion but have hidden advantages. A KNAS could be 
designed to use these models and recommend these 
frequent routes to users (Kantardzic et al., 2004). 

The process of designing a KNAS involves three 
distinct steps. The first step is recording the naviga- 
tional data of users. Selection of the group of users 
that will have their navigational behavior recorded is 
dependent upon the application. Ideally, this group of 
users should be experienced with the system 
(Peterson & Darken, 2000; Peterson et al., 2000).
The data that are recorded can include both spatial 
and non-spatial attributes pertaining to the naviga- 
tion of users. Examples of recorded data could 
include the landmarks and objects visited, routes 
traversed by users, as well as information on longi- 
tude, latitude, elevation, and time (Shekhar & Huang, 
2001; Huang et al., 2003). 

The second step is the actual mining of data. In 
most cases, data will need to be preprocessed before 
applying the mining algorithms (Kantardzic, 2003). 
The data-mining process will result in models that 
will be used by the KNAS. 

The third and final step is actual implementation 
of the KNAS and the corresponding interface. The 
interface of the KNAS needs to allow the user to 
issue navigational queries, and the KNAS must use 
the derived data-mining models in order to formulate 
a reply. Figure 1 depicts the three steps necessary 
for construction of a KNAS: recording of the navi- 
gational data, the data mining process to form mod- 
els, and the implementation of the KNAS. 

To better demonstrate the concepts and ideas
behind a KNAS, we discuss the construction of a
KNAS that is capable of recommending routes
of travel from one landmark to another. As
previously discussed, landmarks are an important 
component in wayfinding and commonly are found in 
various virtual environments (Steck & Mallot, 2000; 
Vinson, 1999). Some may argue that relying on 
knowledge extracted from the records of previous 
users to derive routes is an overhead. After all, there
are already tools that can produce route directions
on how to get from one point to another without the
data-mining process. An example would be using a
tool on the Internet to get driving directions from
one’s home to a store. However, these tools usually
take a traditional approach when producing route
directions where the primary factor of consideration
is distance. Therefore, the recommended route is the
shortest possible route.

Figure 1. A Knowledgeable Navigational Assistance System (KNAS) guiding inexperienced users
based on the experience of previous users

In a complex virtual environment, the shortest 
route is not necessarily the preferred route, because 
there are often other criteria included in defining an 
appropriate route. These other criteria include choos- 
ing a route based on the amount of scenery, the 
amount of obstacles encountered along the way, the 
educational value associated with the route, and so 
forth. If additional criteria besides distance are im- 
portant for users of a virtual environment, then the 
model derived from the data-mining process will 
reflect this trend (Kantardzic et al., 2004).

Figure 2 introduces a two-dimensional map of a 
simple virtual city that will be used to show the 
operations of the proposed KNAS. The virtual city 
has three landmarks (L1, L2, L3) and four roads (R1,
R2, R3, R4) connecting the landmarks. The first step
in designing the KNAS is to record the movements of 
several experienced users. For the sake of simplicity, 
each movement is recorded simply as a sequence of 
landmarks and roads traversed. A user’s sequence
may be L1 R1 L2, which means that the user started
at landmark L1 and used road R1 to get to landmark
L2.
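
The recording format described above can be illustrated with a short Python sketch. It assumes that records are stored as space-separated strings and that landmark and road symbols start with "L" and "R" respectively; the function name parse_record is hypothetical and is not part of any KNAS implementation described here.

```python
def parse_record(record):
    """Split a recorded movement such as "L1 R1 L2" into landmarks and roads.

    Assumption: a well-formed record alternates landmark, road, landmark, ...
    and therefore starts and ends with a landmark.
    """
    symbols = record.split()
    if len(symbols) % 2 == 0:
        raise ValueError("a record must start and end with a landmark")
    landmarks = symbols[0::2]  # symbols at even positions
    roads = symbols[1::2]      # symbols at odd positions
    if not all(s.startswith("L") for s in landmarks):
        raise ValueError("expected a landmark symbol at every even position")
    if not all(s.startswith("R") for s in roads):
        raise ValueError("expected a road symbol at every odd position")
    return landmarks, roads


landmarks, roads = parse_record("L1 R1 L2")
print(landmarks, roads)
```

Validating the alternation at recording time keeps malformed sequences out of the later mining step.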

The next step is the discovery of the different
routes that previous users have actually taken in
getting from each landmark L_i to each landmark L_j,
for i ≠ j, with i and j ranging from 1 to the total number of
landmarks (i.e., three). Since the routes are stored
as sequences of symbols, this can be accomplished 
by using a modified algorithm for sequence mining 
(Soliman, 2004). Figure 3a shows the result of 
mining routes from a hypothetical database for the 
city in Figure 2. The routes are followed by the 
corresponding count of occurrence and support. 
Figure 3b shows the final model of the most frequent
routes. The KNAS will treat these most
frequently used routes as the recommended routes
of travel.
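
As a rough illustration of this mining step, the following Python sketch counts the routes observed between each pair of landmarks and keeps the most frequent one. It is a simplified frequency count, not the modified sequence mining algorithm cited above. The records are invented for illustration and are not the database behind Figure 3, and support is assumed here to mean the share of observed routes for the same landmark pair.

```python
from collections import Counter, defaultdict


def mine_routes(records):
    """Count each landmark-to-landmark route found in the recorded sequences."""
    routes = defaultdict(Counter)
    for record in records:
        symbols = record.split()
        # Landmarks occupy the even positions; every landmark-to-landmark
        # subsequence of a record counts as one observed route.
        for i in range(0, len(symbols), 2):
            for j in range(i + 2, len(symbols), 2):
                if symbols[i] != symbols[j]:
                    pair = (symbols[i], symbols[j])
                    routes[pair][tuple(symbols[i:j + 1])] += 1
    return routes


def final_model(routes):
    """Keep the most frequent route per landmark pair, with its support."""
    model = {}
    for pair, counts in routes.items():
        route, count = counts.most_common(1)[0]
        model[pair] = (" ".join(route), count / sum(counts.values()))
    return model


# Hypothetical records; the road assignments are assumptions.
records = ["L1 R1 L2 R3 L3", "L1 R1 L2 R3 L3", "L1 R2 L3"]
model = final_model(mine_routes(records))
print(model[("L1", "L3")])
```

With these three invented records, two of the three observed routes from L1 to L3 pass through L2, so that route enters the final model even though a direct road exists.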

The final step is the design of the KNAS that can 
use this model. For this particular KNAS, the user 
interface should allow the user to issue a query such 
as, “What are the directions for traveling from 
Landmark 1 to Landmark 3?” The reply would be 
formulated by examining the final model, as seen in 
Figure 3b and translating the sequence into directions.
For example, a simple reply would recommend,
“Start at Landmark 1, travel on Road 1, reach
Landmark 2, travel on Road 3, end at Landmark 3.”
Notice that the recommended route is not the shortest
possible route. Further analysis may reveal that
previous users may have preferred this route, since
it is more scenic.

Figure 2. A 2-D map of a simple virtual city with three landmarks and four roads
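
The translation of a stored sequence into a reply can be sketched in a few lines of Python. This is a minimal illustration that assumes routes are stored as space-separated symbols such as "L1 R1 L2 R3 L3"; the function names describe and directions are hypothetical.

```python
def describe(symbol):
    """Turn a symbol such as "L1" or "R3" into "Landmark 1" or "Road 3"."""
    kind = {"L": "Landmark", "R": "Road"}[symbol[0]]
    return f"{kind} {symbol[1:]}"


def directions(route):
    """Translate a route sequence into a plain-language reply."""
    symbols = route.split()
    if len(symbols) < 3 or len(symbols) % 2 == 0:
        raise ValueError("a route needs at least landmark, road, landmark")
    steps = [f"Start at {describe(symbols[0])}"]
    last = len(symbols) - 1
    for i in range(1, len(symbols), 2):
        steps.append(f"travel on {describe(symbols[i])}")
        # The final landmark ends the route; earlier ones are waypoints.
        verb = "end at" if i + 1 == last else "reach"
        steps.append(f"{verb} {describe(symbols[i + 1])}")
    return ", ".join(steps) + "."


print(directions("L1 R1 L2 R3 L3"))
# Start at Landmark 1, travel on Road 1, reach Landmark 2, travel on Road 3, end at Landmark 3.
```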

This discussion has been kept simple for demonstrative
purposes. In reality, the process of constructing
a KNAS to recommend routes from one
landmark to another requires much more work
(Sadeghian et al., 2005). For example, virtual environments
usually do not record the movement of
users as sequences of symbols but rather as three- 
dimensional coordinates. Pattern matching tech- 
niques must be used to translate these coordinates 
into sequences. More symbols would have to be 
introduced to account for such components as inter- 
sections and boundaries encountered during naviga- 
tion. The sequences must be preprocessed in order 
to eliminate noisy data such as loops that correspond 
to backtracking and disorientation. The sequence 
mining algorithm must be efficient to deal with a 
large amount of data. When applying the sequence
mining algorithm to find routes, all subsequences
need to be considered. When making the final model 
of frequent sequences corresponding to recom- 
mended routes, more rigorous statistical methods 
such as confidence intervals should be used instead 
of simple percentages. 
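
One of the preprocessing steps mentioned above, the elimination of loops, can be sketched as follows. The sketch assumes that any return to an already visited landmark is noise caused by backtracking or disorientation, which may be too strict for environments where revisiting a landmark is intentional. The function name remove_loops is hypothetical.

```python
def remove_loops(symbols):
    """Drop every detour that returns to an already visited landmark."""
    cleaned = []
    position = {}  # landmark -> its index in `cleaned`
    for symbol in symbols:
        if symbol.startswith("L") and symbol in position:
            # The user is back at a known landmark: discard the detour
            # recorded since the first visit to that landmark.
            keep = position[symbol] + 1
            for dropped in cleaned[keep:]:
                position.pop(dropped, None)
            del cleaned[keep:]
        else:
            if symbol.startswith("L"):
                position[symbol] = len(cleaned)
            cleaned.append(symbol)
    return cleaned


print(remove_loops("L1 R1 L2 R1 L1 R2 L3".split()))
# ['L1', 'R2', 'L3']
```

In the example, the user travels from L1 to L2, backtracks to L1, and then proceeds to L3, so only the final leg survives the cleaning.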

When the final model of frequent sequences is 
built, there is no guarantee that all possible combina- 
tions of landmarks will have a frequent sequence 
registered in the final model. This is true especially 
as the complexity of the environment increases. For 
example, a virtual environment with 500 landmarks
would require the discovery of 124,750 recommended
routes, one for each unordered pair of landmarks
(500 × 499 / 2). This is often difficult if the amount
of navigational data is limited and if rigorous statis- 
tical methods are used for determining recommended 
routes. Therefore, strategies must be implemented 
to combine several recommended routes, if the route 
needed is not mined explicitly. If this is not possible, 
the KNAS would have to resort to traditional criteria 
to come up with a recommendation. 
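
The following Python sketch illustrates both the pair count and one possible combination strategy. It assumes the final model is a mapping from (start, end) landmark pairs to route strings, and it chains mined routes with a breadth-first search so that the combined route uses as few mined segments as possible. This is only one of many conceivable strategies, and the names pair_count and compose_route are hypothetical.

```python
from collections import deque


def pair_count(n):
    """Number of unordered landmark pairs among n landmarks."""
    return n * (n - 1) // 2


def compose_route(model, start, goal):
    """Return a mined route, or chain mined routes, from start to goal.

    Returns None when no chain exists; a KNAS would then fall back on
    traditional criteria such as distance.
    """
    if (start, goal) in model:
        return model[(start, goal)]
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        here, path = queue.popleft()
        for (a, b), route in model.items():
            if a != here or b in seen:
                continue
            # Append the segment without repeating its first landmark.
            extended = path + route.split()[1:]
            if b == goal:
                return " ".join(extended)
            seen.add(b)
            queue.append((b, extended))
    return None


print(pair_count(500))  # 124750
model = {("L1", "L2"): "L1 R1 L2", ("L2", "L3"): "L2 R3 L3"}
print(compose_route(model, "L1", "L3"))  # L1 R1 L2 R3 L3
```

A chained route inherits no support value of its own, so a practical system would probably mark it as a weaker recommendation than an explicitly mined route.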

Figure 3. Mining the navigational data: (a) The discovered routes; (b) Final model of frequent routes

FUTURE TRENDS

The technology of the KNAS is currently in its 
infancy. As virtual environments become increas- 
ingly more sophisticated, so does the database con- 
taining the data associated with the navigation and 
activities of the users. There is potentially a great 
deal of hidden knowledge contained within these 
data, and research is needed to discover ways to 
extract this knowledge. This is true especially for 
large complex virtual systems that support multiple 
users and distributed virtual environments. In addi- 
tion, research is needed in designing a complex 
KNAS that takes advantage of stream mining tech- 
niques in order to update and modify the discovered 
models as more data are accumulated.



CONCLUSION 

A common problem faced by users of virtual environments
is a lack of spatial orientation. This is
extremely problematic, since successful navigation
is crucial to derive any benefit from most virtual 
environments. Numerous tools have been intro- 
duced in the past to aid in wayfinding, but much work 
remains in this field. 

Knowledgeable navigational assistance systems 
offer an alternative to the traditional tools used in the 
past. Similar to traditional tools, a KNAS aids the 
user with navigational tasks, but the recommenda- 
tion made by a KNAS is more likely to be viewed 
positively by the end user, since the recommenda- 
tions have been formulated based on data of previ- 
ous users. Therefore, a KNAS has the potential of 
enhancing the quality of human-computer interac- 
tions within virtual environments. 



ACKNOWLEDGMENT 

This research has been funded by the National 
Science Foundation (NSF) under grant #0318128. 







Knowledgeable Navigation in Virtual Environments 



REFERENCES 

Burdea, G., & Coiffet, P. (2003). Virtual reality 
technology. NJ: John Wiley & Sons. 

Conroy, R. (2001). Spatial navigation in immersive 
virtual environments. Doctoral thesis. London: 
University College London. 

Darken, R., & Peterson, B. (2002). Spatial orienta- 
tion, wayfinding, and representation. In K. Stanney 
(Ed.), Handbook of virtual environments: De- 
sign, implementation, and applications. Mahwah, 
NJ: Lawrence Erlbaum Associates. 

Darken, R., & Sibert, J. (1996). Navigating large 
virtual spaces. The International Journal of Hu- 
man-Computer Interaction, 5(1), 49-72. 

Dijk, B., Den, A., Rieks, N., & Zwiers, J. (2003). 
Navigation assistance in virtual worlds. Informing 
Science Journal, 6, 115-125. 

Elvins, T., Nadeau, D., Schul, R., & Kirsh, D. 
(2001). Worldlets: 3D thumbnails for wayfinding in 
large virtual worlds. Presence, 10(6), 565-582. 

Han, J., & Kamber, M. (2001). Data mining: Concepts
and techniques. San Francisco: Morgan
Kaufmann Publishers. 

Huang, Y., Xiong, H., Shekhar, S., & Pei, J. (2003). 
Mining confident co-location rules without a support 
threshold. Proceedings of the 18th ACM Symposium
on Applied Computing (SAC), Melbourne.

Kantardzic, M. (2002). Data mining: Concepts, 
models, methods, and algorithms. Piscataway, 
NJ: IEEE Press. 

Kantardzic, M., Rashad, S., & Sadeghian, P. (2004). 
Spatial navigation assistance system for large virtual 
environments: Data mining approach. Proceedings 
of the Mathematical Methods for Learning, Villa 
Geno, Como, Italy. 

Modjeska, D., & Waterworth, J. (2000). Effects of 
desktop 3D world design on user navigation and 
search performance. Proceedings of the IEEE 
Information Visualization, Salt Lake City, Utah. 

Peterson, B., & Darken, R. (2000). Knowledge 
representation as the core factor for developing
computer generated skilled performers. Proceedings
of the I/ITSEC, Orlando, Florida.

Peterson, B., Stine, J., & Darken, R. (2000). A 
process and representation for modeling expert navigators.
Proceedings of the 9th Conference on
Computer Generated Forces and Behavioral 
Representation, Orlando, Florida. 

Sadeghian, P., Kantardzic, M., Lozitsky, O., &
Sheta, W. (2005). Route recommendations: Navigation
distance in complex virtual environments. Proceedings
of the 20th International Conference on
Computers and Their Applications, New Orleans, 
Louisiana. 

Shekhar, S., & Huang, Y. (2001). Discovering spatial
co-location patterns: A summary of results. Proceedings
of the 7th International Symposium on
Spatial and Temporal Databases (SSTD), Redondo 
Beach, California. 

Sherman, W., & Craig, A. (2002). Understanding 
virtual reality. San Francisco: Morgan Kaufmann. 

Soliman, M. (2004). A model for mining distributed
frequent sequences. Doctoral thesis. Louisville,
KY: University of Louisville.

Steck, S., & Mallot, H. (2000). The role of global and
local landmarks in virtual environment navigation. 
Presence, 9(1), 69-83. 

Vinson, N. G. (1999). Design guidelines for land- 
marks to support navigation in virtual environments. 
Proceedings of the ACM Conference on Human 
Factors in Computing Systems, Pittsburgh, Penn- 
sylvania. 

KEY TERMS 

Data Mining: A process by which previously 
unknown patterns, rules, and relationships are dis- 
covered from data. 

Knowledgeable Navigational Assistance Sys- 
tem: A system that helps a user carry out naviga- 
tional tasks in a virtual environment by using data 
mining models derived from analyzing the naviga- 
tional data of previous users. 

Landmark: A distinctive and memorable object.

Landmark Knowledge: A type of spatial knowl- 
edge dealing with information about visual features 
of landmarks. 

Motion: The physical or motoric element of 
navigation. 

Navigation: The aggregate of motion and 
wayfinding. 

Procedural Knowledge: A type of spatial 
knowledge dealing with the navigational actions 
required in order to follow a particular route to a 
destination. 



Survey Knowledge: A type of spatial knowl- 
edge dealing with information about location, orien- 
tation, and size of regional features. 






Virtual Environment: A 3-D computer-syn- 
thesized world in which a user can navigate, interact 
with objects, and perform tasks. 



Wayfinding: The cognitive element of naviga- 
tion dealing with developing and using a cognitive 
map. 

INTRODUCTION

Amid the many published pages of excited hyperbole 
regarding the potential of the Internet for human 
communications, one salient feature of current 
Internet communication technologies is frequently 
overlooked: the reality that Internet- and computer- 
mediated communications, to date, are communica- 
tive environments constructed through language 
(mostly text). In cyberspace, written language there- 
fore mediates the human-computer interface as well 
as the human-human interface. What are the impli- 
cations of the domination of Internet and computer- 
mediated communications by text? 

Researchers from diverse disciplines — from dis- 
tance educators to linguists to social scientists to 
postmodern philosophers — have begun to investi- 
gate this question. They ask: Who speaks online, and 
how? Is online language really text, or is it “speech”?
How does culture affect the language of cyberspace? 
Approaching these questions from their own disci- 
plinary perspectives, they variously position 
cyberlanguage as “text,” as “semiotic system,” as 
“socio-cultural discourse” or even as the medium of 
cultural hegemony (domination of one culture over 
another). These different perspectives necessarily 
shape their analytical and methodological approaches 
to investigating cyberlanguage, underlying decisions 
to examine, for example, the details of online text, 
the social contexts of cyberlanguage, and/or the 
social and cultural implications of English as Internet 
lingua franca. Not surprisingly, investigations of 
Internet communications cut across a number of 
pre-existing scholarly debates: on the nature and 
study of “discourse,” on the relationships between 
language, technology and culture, on the meaning 
and significance of literacy, and on the literacy 
demands of new communication technologies. 



BACKGROUND 

The multiple meanings of the word “language” — 
both academic and colloquial — allow it to signify 
multiple phenomena in different analytical frame- 
works, and complicate any simple search for litera- 
ture on the language of cyberspace. This article 
surveys the breadth of theoretical and empirical 
writing on the nature and significance of text, lan- 
guage, and literacy in Internet- and computer-medi- 
ated communications, and indicates the different 
theoretical approaches employed by current re- 
searchers. In particular, this article emphasizes re- 
search and theory relevant to conceptions of the 
Internet as a site of international and intercultural 
communications — the so-called “global village” — 
and offers some reflection on the importance of 
research on online language for the field of human- 
computer interaction. 

PERSPECTIVES ON THE LANGUAGE 
OF CYBERSPACE 

Cyberlanguage as Digital Text 

Perhaps belying their perception of Internet communications
as primarily written communication (a
perspective contested by some; Collot & Belmore,
1996; Malone, 1995), a number of authors have
focused on the features of digital text (and their 
impact on readers) as an approach to investigating 
cyberspace communications. A particular area of 
interest has been the development of hypertext, 
whose non-linear, non-sequential, non-hierarchical 
and multimodal (employing images, sound, and sym- 
bols as well as text) nature seemed to place it in stark 
contrast to traditional printed texts. Hypertext has 
been hailed as a postmodern textual reality (Burbules
& Callister, 2000; Landow, 1997; Snyder, 1996), 
making fragmentation and complex cross-referencing
of text possible and easy. Researchers also
argue that hypertext radically changes the nature of 
literacy, positioning the author as simply the “source,” 
and favouring a new form of open-ended “associa- 
tive” reading and thought (Burbules & Callister, 
2000; Richards, 2000). One of the first researchers 
to focus on hypertext was Kaplan (1995), who 
described how it would “offer readers multiple trajectories
through the textual domain” (¶ 1). Kaplan
suggests that “each choice of direction a reader 
makes in her encounter with the emerging text, in 
effect, produces that text,” and points out that while 
some hypertexts are printable, many new forms are 
native only to cyberspace, and have no printable 
equivalents. Douglas (2000), on the other hand, 
discusses ways in which hypertext may offer read- 
ers less autonomy than paper-based texts, a position 
supported by Harpold (2000) who argues that digital 
texts are “empirically fragile and ontologically in- 
consistent” (p. 129). Tuman (1995) offers a particu- 
larly strong critique of hypertext, which is, he argues, 
“ideally suited for the storing and accessing of 
diverse information, [but] not for sustained, critical 
analysis.” 

What are the implications of hypertext for hu- 
man-computer interaction? Braga and Busnardo 
(2004) argue that while hypertext media encourage 
multimodal communications, some designers (espe- 
cially novice designers) are not familiar enough with 
this type of communication because their own liter- 
ate practices tend to be anchored in verbal language 
and print-based text. Construction of effective 
hypertext, they argue, calls for new and different 
approaches to organization of information and seg- 
mentation, a recognition of the challenges of screen- 
reading and navigation, and an understanding of the 
evolving conventions of different electronic con- 
texts (Snyder, 1998) in which “electronically liter- 
ate” and novice users may have different expecta- 
tions. 

Cyberlanguage as Semiotic System 

A significant proportion of current studies of online 
language report on semiotics: the detailed and sometimes
mechanistic features — signs and symbols — of
the linguistic systems elaborated by users in a range
of Internet and computer-mediated communication 
venues such as email, asynchronous discussion 
boards, computer conferencing, and synchronous 
“chat” platforms. 

Many papers discuss evolving conventions of 
online communications: features, grammar, and lexi- 
cography. Most compare and contrast communica- 
tions in different venues and/or with written or 
spoken language (almost always English). They 
generally conclude that online communication is an 
intermediate stage between oral and written modali- 
ties, and some studies (Collot & Belmore, 1996) 
differentiate further between features of synchro- 
nous (online) and asynchronous (offline) digital com- 
munications. A number of papers examine in par- 
ticular the textual and graphical systems (such as 
emoticons) that users employ within online commu- 
nications to add back some of the contextual fea- 
tures that are lost in electronic communications (for 
detailed references see Macfadyen, Roche, & Doff, 
2004). Burbules (1997) meanwhile highlights the 
hyperlink as the key feature of digital texts, and 
explores some of the different roles links may play 
beyond their simple technical role as a shortcut: an 
interpretive symbol for readers, a bearer of the 
author’s implicit ideational connections, an indicator
of new juxtapositions of ideas. 

Kress and Van Leeuwen (1996) propose that the 
shift to multimodality facilitated by digital texts ne- 
cessitates a new theory of communication that is not 
simply based on language, but also takes into ac- 
count semiotic domains and the multiplicity of semiotic 
resources made available by digital media. While 
such resources are significant features of online 
interfaces, we caution against simplistic extrapola- 
tion to interface design models that over-privilege 
signs and symbols as mediators of meaning in Internet- 
and computer-mediated interactions, and also against 
overly simple attempts to identify sets of “culturally-specific”
signs and symbols for user groups based on
essentialist notions of culture. (This phenomenon 
and its associated problems are discussed more 
extensively in Macfadyen, 2006). 

New Literacies? 

In a milieu where the interface is overwhelmingly 
dominated by text, it might seem self-evident that 
“literacy” be a key factor in determining the success
of human-computer and human-human interaction. 
But what kind of “literacy” is required for Internet 
and computer-mediated communications? Tradition- 
ally, researchers concerned with literacy have de- 
fined it as the ability to read, write, and communicate, 
usually in a print-text-based environment. In the last 
decade, however, researchers in the field of New 
Literacy Studies (NLS) have challenged this per- 
spective, and have initiated a focus on literacies as 
social practices, moving away from individual and 
cognitive-based models of literacy (Lea, 2004). A 
social practice model of literacy recognizes that 
language does not simply represent some kind of 
objective truth, but actually constitutes meaning in a 
given context. Writing and reading, it is argued, are 
key ways in which people negotiate meaning in 
particular contexts (Street, 1984). 

Bringing NLS perspectives to the world of Internet 
and computer-mediated communication, some writ- 
ers are now countering simplistic “operational” no- 
tions of electronic literacy that have tended to focus 
solely on “performance with the linguistic systems, 
procedures, tools and techniques involved in making 
or interpreting [digital] texts” (Goodfellow, 2004). 
Instead, they highlight discussions of the equal impor- 
tance of “cultural” and “critical” aspects of literacy 
for online communications (Lankshear, Snyder, & 
Green, 2000), where the “cultural” dimension implies 
the ability to use operational skills in authentic social 
contexts and allow participation in social discourses, 
and the “critical” dimension refers to an even more 
sophisticated level of interaction with electronic dis- 
courses, including the ability to evaluate, critique, and 
redesign them. 

Theoretical perspectives on “visual literacy,” “digi- 
tal literacy,” “electronic literacy,” and “computer 
literacy” have proliferated, with important contribu- 
tions made by authors such as Warschauer (1999), 
Street (1984), Jones, Turner, and Street (1999), Snyder 
(1998) and Richards (2000). Hewling (2002) offers a 
detailed review of debates in this field. Next, a 
sampling of recent papers demonstrates that argu- 
ments tend to be polarized, positing electronic literacies 
either as continuous with existing human communi- 
cation practices, or radically new and postmodern. 

One group of research papers concentrates on the 
“new” skills required of users for communicative 
success in the online arenas of the Internet which,
according to Thurstun (2000), comprise “entirely
new skills and habits of thought” (p. 75). Gibbs 
(2000) extends this to suggest that new forms of 
communication are actually constructing “new forms 
of thinking, perceiving and recording” (p. 23). 
Kramarae (1999) discusses the new “visual literacy”
(p. 51) required of Internet communicators,
while Abdullah (1998) focuses on the differences in 
style and tone between electronic discourse and 
traditional academic prose. 

On the other hand, writers such as Richards 
(2000) argue that dominant hypermedia models of 
electronic literacy are too limited, and rely too 
heavily on postmodern theories of representation 
and poststructuralist models which characterize 
writing and speaking as separate communication 
systems. Burbules (1997) similarly counters sug- 
gestions that the reading (“hyper-reading”) prac- 
tices required for hypertexts represent a postmodern 
break with old literacy traditions, reminding us that 
“there must be some continuity between this emer- 
gent practice and other, related practices with 
which we are familiar — it is reading, after all” (¶ 3).
He continues by emphasizing the importance of the 
contexts and social relations in which reading takes 
place and agrees that “significant differences in 
those contexts and relations mean a change in 
[reading] practice” (¶ 2) — though he characterizes
this change more as evolution than revolution. 

Further highlighting the connections between 
language, literacy, and socio-cultural context, 
Warschauer (1999) points to the inutility of such 
simple binary models of “old” vs. “new” literacies, 
arguing that the roots of mainstream literacy lie in
the “mastery of processes that are deemed valuable 
in particular societies, cultures and contexts” (p. 1). 
For this reason, he suggests, there will be no one 
electronic literacy, just as there is no one print 
literacy, and indeed, a number of studies point to the 
growing diversity of literacies, new and hybrid. For 
example, Cranny-Francis (2000) argues that users 
need a complex of literacies — old and new — to 
critically negotiate Internet communication. Dudfield 
(1999) agrees that students are increasingly engag- 
ing in what she calls “hybrid forms of literate 
behaviour” (¶ 1), while Schlickau (2003) specifically
examines prerequisite literacies that learners
need in order to make effective use of hypertext 
learning resources. In a more applied study, Williams
and Meredith (1996) attempt to track development
of electronic literacy in new Internet users.

Together, these studies highlight current thinking 
on which sets of skills users may need in order to 
communicate effectively online. A further level of 
debate centres, however, on theoretical claims that 
different kinds of technologies may demand differ- 
ent “kinds” of literacy — a position that Lea (2004) 
and others critique as excessively deterministic. 
Murray (2000) similarly explores claims of techno- 
logically induced socio-cultural paradigm shifts in 
greater detail, arguing that technology does not 
impose new literacy practices and communities, but 
merely facilitates social and cultural changes — 
including changes in literacy practices — that have 
already begun. 

Cyberspace Contexts, Identities, and 
Discourses 

Researchers from other disciplines have meanwhile
begun to amass evidence of the great diversity of
social and cultural contexts that influence and constitute
Internet-mediated discourses, lending weight
to theoretical perspectives that emphasize the importance
of social and cultural contexts of literacy
and communicative practices. Galvin (1995) samples
and explores what he calls the “discourse of 
technoculture”, and attempts to locate it in various 
social and political contexts. Gibbs (2000) considers 
a range of influences that have shaped Internet style 
and content, and the social implications of the phenomenon
of cyberlanguage. Kinnaly’s (1997) didactic
essay on “netiquette” is included here as an
example of the ways in which “the rules” of Internet 
culture (including language and behaviour) are nor- 
malized, maintained, and manifested via specific 
communicative practices. Wang and Hong (1996) 
examine the phenomenon of flaming in online com- 
munications and argue that this behaviour serves to 
reinforce cyberculture norms, as well as to encour- 
age clear written communication. In the world of 
online education, Conrad (2002) reports on a code of 
etiquette that learners valued and constructed; these 
communally constructed “nice behaviours” contrib- 
uted to group harmony and community, she argues. 
Jacobson (1996) investigates the structure of con- 
texts and the dynamics of contextualizing communi- 



cation and interaction in cyberspace, while Gibbs 
and Krause (2000) and Duncker (2002) investigate 
the range of metaphors in use in the virtual world, 
and their cultural roots. Collot and Belmore (1996) 
investigate elements such as informativity , narrativity , 
and elaboration; Condon and Cech (1996) examine 
decision-making schemata, and interactional func- 
tions such as metalanguage and repetition; and 
Crystal (2001) examines novel genres of Internet 
communications. 

Internet language is also shaped by the relationships
and identities of the communicators. For example,
Voiskounsky (1998) reports on ways in
which culturally determined factors (status, position, 
rank) impact “holding the floor and turn-taking rules” 
(p. 100) in Internet communications, and Paolillo 
(1999) describes a highly structured relationship 
between participants’ social positions and the lin- 
guistic variants they use in Internet Relay Chat. 
Other papers analyze crucial issues like the effects 
of emotion management, gender, and social factors 
on hostile types of communication within electronic 
chat room settings (Bellamy & Hanewicz, 1999) or 
compare the male-female schematic organization of 
electronic messages posted on academic mailing 
lists (Herring, 1996). De Oliveira (2003) assesses 
“politeness violations” in a Portuguese discussion 
list, concluding that in this context male communicators
assert their traditional gender roles as “adjudicators
of politeness” (¶ 1). Conversely,
Panyametheekul and Herring (2003) conclude from 
a study of gender and turn allocation in a Web-based 
Thai chat-room that Thai women are relatively 
empowered in this context. Liu (2002) considers 
task-oriented and social-emotional-oriented aspects 
of computer-mediated communication, while Dery’s 
1994 anthology includes essays examining the com- 
munications of different cyberspace subcultures 
(hackers, technopagans) and the nature and function 
of cyberspace. A final group of contributions to the
study of the language of cyberspace considers the
implications of English-language domination of
cyberspace and cyberculture (for extensive references,
see Macfadyen et al., 2004). It is increasingly
evident, then, that the communicative contexts and
socio-cultural configurations of online and networked
communications are many and various, and defy
simple generalization and classification.

FUTURE TRENDS

Lest we imagine that the current landscape of “read- 
only” Internet and computer-mediated communica- 
tions — be it hypertext or e-mail — is static or estab- 
lished, it is important to recognize that technologies 
that are even more challenging to current concep- 
tions of online literacy and language have already 
appeared on the horizon. To date, readers of Internet 
content have almost no opportunity to create or 
modify online text, while a limited number of authors 
or “producers” control all content selection and 
presentation. New forms of hypertext, such as those 
promised by wikis (a Web site or other hypertext 
document collection that allows any user to add 
content, and that also allows that content to be edited 
by any other user) will blur these clear distinctions 
between author and reader, producer and consumer 
of online text or “content” (Graddol, 2004). While it 
is already understood that reading involves the pro- 
duction of meaning, new open-access technologies 
permit multiple reader-authors to register different 
interpretations and analysis directly within text, and 
participate in a new and dynamic collaborative pro- 
cess of co-construction of meaning. As Braga and 
Busnardo (2004) suggest, these developments will 
offer an entirely new challenge to communicators,
greater than simply the navigation of non-linear 
texts. Readers will increasingly face multiply- 
authored texts that exist in a condition of constant 
change, a situation that radically challenges our 
existing notions of how knowledge is now produced, 
accessed, and disseminated. 
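
The wiki model described above, in which any user can add content and any other user can edit it, can be illustrated with a small data structure. This is only a minimal sketch: the `WikiPage` class, its method names, and the choice to keep every earlier revision are assumptions made for illustration and do not describe any particular wiki engine.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class WikiPage:
    """Hypothetical multiply-authored page: every edit is kept as a revision."""
    title: str
    revisions: List[Tuple[str, str]] = field(default_factory=list)  # (author, full text)

    def edit(self, author: str, text: str) -> None:
        # Any user may add or change content; earlier versions are retained.
        self.revisions.append((author, text))

    @property
    def current_text(self) -> str:
        # The text a reader sees is simply the latest revision.
        return self.revisions[-1][1] if self.revisions else ""

    @property
    def authors(self) -> List[str]:
        # Distinct contributors, in order of first contribution.
        seen: List[str] = []
        for author, _ in self.revisions:
            if author not in seen:
                seen.append(author)
        return seen


page = WikiPage("Online literacy")
page.edit("reader_a", "Literacy is a social practice.")
page.edit("reader_b", "Literacy is a social practice shaped by its context.")
```

Here the "reader" and the "author" are the same kind of participant, and the page has no single fixed state, which is the blurring of roles the paragraph describes.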

CONCLUSION 

Individuals employ a range of language and literacy 
practices in their interactions with textual — and 
increasingly multimodal — interfaces, as they con- 
struct and exchange meaning with others; these 
processes are further influenced by the social and 
cultural contexts and identities of the communica- 
tors. At the communicative interface offered by the 
Internet and other computer-mediated networks, 
language, literacy practices and technologies can all 
be seen as what Vygotsky (1962) calls the “cultural
tools” that individuals can use to mediate (but not
determine) meaning. This suggests that any meaningful
theoretical approach to understanding language
on the Internet must not privilege technology
over literacy, or vice versa, but must integrate the
two. With this in mind, Lea (2004) suggests that
activity theory, actor network theory, and the concept
of “communities of practice” may be particularly
helpful perspectives from which to consider
literacy and how it impacts Internet- and computer-mediated
communications. Exemplifying new approaches
that are beginning to recognize the importance
of communicator learning, agency, and adaptation,
Lea particularly highlights Russell’s (2002)
interpretation of activity theory, which focuses attention
on “re-mediation” of meaning (the ways in
which individuals adopt new tools to mediate their
communications with others), as a framework that
may allow new explorations and understandings of
the ways that humans adopt computers as new
cultural tools in their interactions with each other.

KEY TERMS

Cyberculture: As a social space in which hu- 
man beings interact and communicate, cyberspace 
can be assumed to possess an evolving culture or set 
of cultures (“cybercultures”) that may encompass 
beliefs, practices, attitudes, modes of thought, 
behaviours, and values. 

Cyberlanguage: The collection of communica- 
tive practices employed by communicators in 
cyberspace, and guided by norms of cyberculture(s). 

Cyberspace: While the “Internet” refers more
explicitly to the technological infrastructure of networked
computers that make worldwide digital communications
possible, “cyberspace” is understood as
the virtual “places” in which human beings can
communicate with each other, and that are made
possible by Internet technologies. Levy (2001) characterizes
cyberspace as “not only the material infrastructure
of digital communications but... the oceanic
universe of information it holds, as well as the
human beings who navigate and nourish that infrastructure.”

(Technological) Determinism: The belief that 
technology develops according to its own “internal” 
laws and must therefore be regarded as an autonomous
system controlling, permeating, and conditioning
all areas of society.

Discourse: Characterized by linguists as units 
of language longer than a single sentence, such that 
discourse analysis is defined as the study of cohe- 
sion and other relationships between sentences in 
written or spoken discourse. Since the 1980s, how- 
ever, anthropologists and others have treated dis- 
course as “practices that systematically form the 
objects of which they speak” (Foucault, 1972, p. 49), 
and analysis has focused on discovering the power 
relations that shape these practices. Most signifi- 
cantly, the anthropological perspective on discourse 
has re-emphasized the importance of the context of 
communicative acts. 

Literacy: Traditionally defined as “the ability to
read, write and communicate,” usually in a print-text-based
environment. New Literacy Studies researchers
now view literacies as social practices,
moving away from individual and cognitive-based
models. This model of literacy recognizes that language
does not simply represent some kind of objective
truth, but actually constitutes meaning in a given
context; literacy, therefore, represents an individual’s
ability to communicate effectively within a given
socio-cultural context.

Postmodern: Theoretical approaches charac- 
terized as postmodern, conversely, have abandoned 
the belief that rational and universal social theories 
are desirable or exist. Postmodern theories also 
challenge foundational modernist assumptions such 
as “the idea of progress,” or “freedom.” 

Semiotics: The study of signs and symbols, both 
visual and linguistic, and their function in communication.

INTRODUCTION

The School of Nursing at the University of British 
Columbia has more than 300 nursing students en- 
gaged in supervised clinical practice in hospital and 
community settings around Vancouver. Likewise, 
the Faculty of Medicine has more than 200 medical 
students undertaking supervised clinical experience 
locally and remotely in the Prince George and 
Vancouver Island regions. The management of these
clinical experiences and the promotion of learning 
while in an active clinical setting is a complex 
process. 

BACKGROUND 

Supporting the students at a distance while under- 
taking their clinical experience is particularly re- 
source-intensive. It requires the creation and main- 
tenance of good communication links with the clini- 
cal and administrative staff, active management, 
clinical visits from faculty, and the provision and 
management of remotely based resources. How- 
ever, there were few existing resources that helped 
to contextualize and embed clinical knowledge in the 
workplace in the practice setting (Landers, 2000). A 
technological solution was developed and imple- 
mented using several clinical applications designed 
for use on personal digital assistants (PDAs). 

MOBILE CLINICAL LEARNING 
TOOLS 

A suite of PDA-based tools was created for a pilot
study with the involvement of nursing and medical 
students during the academic year of 2004-2005 to 
achieve the following objectives: 



• To demonstrate the potential use of mobile 
networked technologies to support and im- 
prove clinical learning. 

• To develop and evaluate a range of mobile 
PDA tools to promote reflective learning in 
practice and to engage students in the process 
of knowledge translation. 

• To develop and evaluate a suite of pedagogic 
tools that help contextualize and embed clinical 
knowledge while in the workplace. 

• To evaluate the value of networked PDA 
resources to help prevent the isolation of stu- 
dents while engaged in clinical practicum. 

The tools developed provide a mobile clinical 
learning environment incorporating an e-portfolio 
interface for the Pocket PC/Windows Mobile 
(Microsoft, 2004) operating system. They were 
implemented on i-mate PDAs equipped with GSM/ 
GPRS (Global System for Mobile Communications/ 
General Packet Radio Service; GSM World, 2002). 
This platform offered considerable flexibility for the
project. It supported cellular telephone
connectivity and the Pocket Internet Explorer Web
browser (a full Internet browser with
support for HTML, XML/XSL, WML, cHTML, and
SSL); the i-mate device had sufficient memory for
the storage of text, audio, image, and video data,
a large screen, a user-friendly interface, and an
integrated digital camera.

The tools included a mobile e-portfolio (with a 
multimedia interface) designed to promote profes- 
sional reflection (Chasin, 2001; Fischer et al., 2003; 
Hochschuler, 2001; Johns, 1995; Kolb, 1984). These 
mobile learning tools were designed to promote the 
skills of documentation of clinical learning, active 
reflection, and also to enable students to immediately 
access clinical expertise and resources remotely. 
Community clinical placements are being used as
the testing domain, as there are currently no restrictions
on using cellular network technology in these
areas, whereas this is currently restricted in acute
hospital settings in British Columbia and many other
parts of the world.

THE PDA INTERFACE DESIGN 

The main interface to the clinical tools was based on 
a clinical e-tools folder on the Pocket PC containing 
icon-based shortcuts to a number of specific appli- 
cations (Figure 1). 

The clinical e-portfolio tool represented the ma- 
jor focus for the project, allowing the student to 
access clinical placement information; log clinical 
hours; achieve clinical competencies; record portfolio
entries in the form of text, pictures, or video clips;
and record audio memos. This provides the user with 
a very adaptable interface, allowing them to choose 
how they input data. For example, a text-based entry 
describing a clinical procedure may be accompanied 
by a picture or audio memo. 

Figure 1. Screenshot of the clinical e-tools folder

The e-portfolio tool also incorporates a reflective
practice wizard prompting the students to work
through the stages of the Gibbs reflective cycle
(Gibbs, 1988) when recording their experiences.
This wizard also allows students to record their
experiences with multimedia, including text, audio,
digital images, or video input. Once the data have
been recorded in the e-portfolio, they can be synchronized
wirelessly (using the built-in GSM/GPRS
or Bluetooth connectivity) with a Web-based portfolio.
The data then can be reviewed and edited by the
student or by clinical tutors.
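
The way such a wizard steps a student through the Gibbs cycle can be sketched as follows. This is an illustrative reconstruction, not the project's implementation: the six stage names follow Gibbs (1988), while `PortfolioEntry`, `run_wizard`, and the `ask` callback are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# The six stages of the Gibbs (1988) reflective cycle, in order.
GIBBS_STAGES = [
    "description",
    "feelings",
    "evaluation",
    "analysis",
    "conclusion",
    "action plan",
]


@dataclass
class PortfolioEntry:
    """One reflective entry; the field names are illustrative assumptions."""
    responses: Dict[str, str] = field(default_factory=dict)
    attachments: List[str] = field(default_factory=list)  # e.g. audio or image file names

    def is_complete(self) -> bool:
        # Complete only when every stage has a non-empty response.
        return all(self.responses.get(stage, "").strip() for stage in GIBBS_STAGES)


def run_wizard(ask: Callable[[str], str]) -> PortfolioEntry:
    """Prompt for each stage in order and collect one response per stage."""
    entry = PortfolioEntry()
    for stage in GIBBS_STAGES:
        entry.responses[stage] = ask(stage)
    return entry


# Simulated session; on a device, `ask` would show a prompt and read user input.
entry = run_wizard(lambda stage: f"notes on {stage}")
entry.attachments.append("dressing_change.jpg")
```

Passing the prompt as a callback keeps the stage logic separate from the input method, which matches the idea that students may respond with text, audio, or images.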

The other icons represent the following applica- 
tions: 

• The synch portfolio icon initiates synchroniza- 
tion of the content of the student’s e-portfolio 
on the PDA with that of a remote server. 

• The University of British Columbia (UBC) 
library icon presents a shortcut to a Pocket 
Internet Explorer Web access to the UBC 
library bibliographic health care database search 
(CINAHL, Medline, etc.). 

• The Pocket Explorer icon presents a shortcut 
to Pocket Internet Explorer for mobile Web 
access. 

• The e-mail icon presents a shortcut to the 
Pocket PC mobile e-mail application. 

The other icons on the screen (Diagnosaurus, 
ePocrates, etc.) represent third-party clinical soft- 
ware that was purchased and loaded onto the PDAs 
in order to support the students’ learning in the clinical
area (e.g. a drug reference guide). 

FUTURE TRENDS

In the future, the PDA will provide a one-stop 
resource to support clinical learning. Students also 
will be able to examine their learning objectives, 
record their achievements, and record notes/memos 
attached to specific clinical records for later review. 
Where students have particular concerns or ques- 
tions that cannot be answered immediately in the 
clinical area, they will be able to contact their 
supervisors or faculty for support using e-mail, cell 
phone, or multimedia messaging service (MMS) 
communications. 

The use of multimedia in PDA interfaces is likely 
to become much more widespread as the cost of 
these devices reduces and they become more ac- 
cessible to a wider spectrum of the population. This 
already is occurring with the merging of cell phone
and PDA technologies and the uptake of MMS and
use of audio and video data entry on mobile devices
(deHerra, 2003).

In the long term, multimedia mobile learning tools
will encourage a more structured process of professional
reflection among students in supervised clinical
practice (Conway, 1994; Copa et al., 1999; Palmer
et al., 1994; Reid, 1993; Sobral, 2000). When unexpected
learning opportunities arise, students will be
able to quickly review online materials in a variety of
formats and prepare for their experience, record
notes, record audio memos or images during their
practice, and review materials following their experience.

An expansion in the use of such mobile clinical 
learning tools is envisaged, and there is considerable 
scope for the widespread application of such tools 
into areas where students are engaged in work-based 
learning. We are likely to see the integration of these 
technologies into mainstream educational practice in 
a wide variety of learning environments outside of the 
classroom. 



CONCLUSION 

The value of these new tools to students in clinical 
practice remains to be demonstrated, as the evalua- 
tion stage of the project has yet to be completed. The 
project also has highlighted the necessity of address- 
ing some of the weaknesses of current PDA design, 
such as the small display screen and the need for 
more built-in data security. However, initial feedback 
appears promising, and the interface design appears 
to promote reflective learning in practice and engage 
students in the process of knowledge translation. 

KEY TERMS

Bluetooth: A short-range wireless radio standard
aimed at enabling communications between
digital devices. The technology supports data transfer
at up to 2 Mbps in the 2.45 GHz band over a 10 m
range. It is used primarily for connecting PDAs, cell
phones, PCs, and peripherals over short distances.

Digital Camera: A camera that stores images in 
a digital format rather than recording them on light- 
sensitive film. Pictures then may be downloaded to 
a computer system as digital files, where they can be 
stored, displayed, printed, or further manipulated. 

e-Portfolio: An electronic (often Web-based) 
personal collection of selected evidence from 
coursework or work experience and reflective com- 
mentary related to those experiences. The e-portfo- 
lio is focused on personal (and often professional) 
learning and development and may include artefacts 
from curricular and extra-curricular activities. 

General Packet Radio Service (GPRS): A
standard for wireless communications that operates
at speeds up to 115 kilobits per second. It is designed
for efficiently sending and receiving small packets of
data. Therefore, it is suited for wireless Internet
connectivity and such applications as e-mail and
Web browsing.
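
To give a feel for what the quoted peak rates mean when synchronizing portfolio media, the following sketch computes ideal transfer times over GPRS and Bluetooth. The 500 kB photo size is a hypothetical figure chosen for illustration, and the calculation assumes the link sustains its peak rate with no protocol overhead, which real connections rarely do.

```python
def transfer_seconds(size_bytes: int, rate_bits_per_second: float) -> float:
    """Ideal transfer time: payload bits divided by the link rate."""
    return size_bytes * 8 / rate_bits_per_second


GPRS_PEAK_BPS = 115_000          # "up to 115 kilobits per second"
BLUETOOTH_PEAK_BPS = 2_000_000   # "up to 2 Mbps"

photo_bytes = 500_000  # hypothetical 500 kB image attached to a portfolio entry

gprs_time = transfer_seconds(photo_bytes, GPRS_PEAK_BPS)
bluetooth_time = transfer_seconds(photo_bytes, BLUETOOTH_PEAK_BPS)

print(f"GPRS:      {gprs_time:.1f} s")       # GPRS:      34.8 s
print(f"Bluetooth: {bluetooth_time:.1f} s")  # Bluetooth: 2.0 s
```

Even under these best-case assumptions, the gap suggests why text and small memos suit a cellular link better than large images or video.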



Global System for Mobile Communications 
(GSM): A digital cellular telephone system intro- 
duced in 1991 that is the major system in Europe and 
Asia and is increasing in its use in North America. 
GSM uses Time Division Multiple Access (TDMA) 
technology, which allows up to eight simultaneous 
calls on the same radio frequency. 

i-Mate: A PDA device manufactured by Car- 
rier Devices with an integrated GSM cellular phone 
and digital camera. The device also incorporates a 
built-in microphone and speaker, a Secure Digital 
(SD) expansion card slot, and Bluetooth wireless 
connectivity. 

Personal Digital Assistant (PDA): A small 
handheld computing device with data input and 
display facilities and a range of software applica- 
tions. Small keyboards and pen-based input systems 
are commonly used for user input. 

Pocket PC: A Microsoft Windows-based oper- 
ating system (OS) for PDAs and handheld digital 
devices. Versions have included Windows CE, 
Pocket PC, Pocket PC Phone Edition, and Windows 
Mobile. The system itself is not a cut-down version 
of the Windows PC OS but is a separately coded 
product designed to give a similar interface. 

Wizard: A program within an application that 
helps the user perform a particular task within the 
application. For example, a setup wizard helps guide 
the user through the steps of installing software on 
his or her PC. 

INTRODUCTION

Our contention is that interactions between humans 
and computers have a moral dimension. That is to 
say, a computer cannot be taken as a neutral tool or 
a kind of neutral technology (Norman, 1993). 1 This 
conclusion seems a bit puzzling and surely paradoxi- 
cal. How can a computer be moral? 

All computational apparatuses can be generally 
considered as moral mediators, but for our consider- 
ations, computers are the best representative tools. 
First of all, they are the most widespread technologi- 
cal devices, they are relatively cheap in comparison 
to other technological utilities, and, very importantly, 
they can be easily interconnected all over the world
through the Internet. This last feature allows people 
to keep in contact with each other and, consequently, 
to improve their relations. Computers require inter- 
actions with humans, but also allow interactions 
between humans. Since morality relates to how to 
treat other people within interactive behaviors, com- 
puters can help us to act morally in several ways. For 
instance, as the concept of moral mediators suggests,
computers can help us to acquire new information
useful for treating other human beings in a more
morally satisfactory way.

BACKGROUND 

In traditional ethics it is commonly claimed that the 
moral dimension primarily refers to human beings 
since they possess intentions, they can consciously 
choose, and they have beliefs. Also, artificial intelli- 
gence (AI) holds this view: Indeed, AI aims at 
creating a moral agent by smuggling and reproducing
those features that make humans moral. On the
contrary, our contention is that computer programs 
can also be considered moral agents even if their 
interfaces do not exhibit or try to explicitly reproduce 
any human moral feature. 2 As Magnani (2005) 
contends, computer programs can be defined as a 
particular kind of moral mediator. 3 More precisely, 
we claim that computers may have a moral impact 
because, for instance, they promote various kinds of 
relations among users, create new moral perspec- 
tives, and/or provide further support to old ones. 4 

MORAL MEDIATORS AND HCI 

In order to shed light on this issue, the concept of 
moral mediator turns out to be a useful theoretical 
device. To clarify this point, consider, for instance, a
cell phone: One of its common features is to ask for
confirmation before sending text. This option affords
the user the chance to check his or her message not only
to find typing mistakes, but also to reflect upon
what he or she has written. In other words, it affords
being patient and more thoughtful. For instance,
after typing a nasty text message to a friend, receiving
a confirmation message may lead a person
to wait and discard the text message. The
software not only affords a certain kind of reaction
(being thoughtful), but it also mediates the user’s
response. The confirmation message functions as a
mediator that uncovers reasons for avoiding the
delivery of the text message. Just reading after a
few seconds what one has furiously written may
contribute to changing one’s mind. That is, a person
might think that a friend does not deserve to receive
the words just typed. Hence, new information is
brought about. According to Magnani (2003), because
of this behavior, we may call this kind of
device a moral mediator.
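
The confirm-before-send behavior in this example can be written down in a few lines. The sketch below is purely illustrative: `send_with_confirmation` and its `confirm` and `deliver` callbacks are hypothetical names, not the interface of any real handset.

```python
from typing import Callable, List


def send_with_confirmation(
    text: str,
    confirm: Callable[[str], bool],
    deliver: Callable[[str], None],
) -> bool:
    """Show the drafted text back to the user and deliver it only if confirmed."""
    if confirm(text):
        deliver(text)
        return True
    return False  # draft discarded; nothing is delivered


outbox: List[str] = []

# Simulated user who rereads the angry draft and declines to send it.
sent = send_with_confirmation("You are a terrible friend!", lambda t: False, outbox.append)
print(sent, outbox)  # False []
```

The mediating step is the call to `confirm`: it is the only point at which the user sees the message again, and therefore the only point at which the new information can arise.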

Various kinds of moral mediators have been 
described that range from the role played by arti- 
facts to the moral aspects that are delegated to 
natural objects and human collectives. In order to 
grasp the role of moral mediators, let us consider 
Magnani’s example of endangered species. 5 When 
we consider animals as subjects requiring protection 
for their own existence, we are using them to depict 
new moral features of living objects previously 
unseen. In this case, endangered species can be- 
come a mediator that unearths and uncovers a new 
moral perspective expanding the notion of moral 
worth and dignity we can also attribute to human 
beings. 6 

AN EXAMPLE OF MORAL MEDIATOR: 
THE “PICOLA PROJECT” 

This section will provide an exemplification of the 
moral mediation previously illustrated. In the follow- 
ing, we shall give a general description of a Web- 
based tool named PICOLA (Public Informed Citizen 
On-Line Assembly). 7 The PICOLA project, devel- 
oped at Carnegie Mellon’s Institute for the Study of 
Information and Technology (InSITeS) and at the 
Center for the Advancement of Applied Ethics 
(CAAE), aims at implementing an online environ- 
ment for community consultation and problem solv- 
ing using video, audio, and textual communication. 

The appeal of deliberative democracy is mainly 
based on two ingredients: first, the idea of a free and 
equal discussion, and second, the consensus achieved 
by the force of the best argument (Fishkin & Laslett, 
2003; Habermas, 1994, 1998). PICOLA can be 
considered a moral mediator because it implements 
those two ideas into a Web-based tool for enhancing
deliberative democracy. Indeed, everyone has equal 
rights to speak and to be listened to, equal time for 
maintaining her or his position, equal weight in a poll, 
and so on. Besides this, it allows the formation of 
groups of discussion for assessing and deliberating 
about different issues. In actual practice,
these two requirements are rarely met. Even if
everyone has the possibility to vote and be voted on,
few persons can actually play a role in deliberative
procedures. Web-based tools like PICOLA promote
participation by allowing all interested citizens to be
involved in a democratic process of discussion. They
enable citizens to take part in democratic meetings
wherever they may be. For instance, with PICOLA
we do not need any actual location because it is
possible to conceive virtual spaces where persons
can discuss following the same rules.

Every Web site has to face the problem of 
trustworthiness. In a physical environment, people 
have access to a great deal of information about 
others: how people are, dress, or speak, and so on. 
This may provide additional information, which is 
difficult to obtain in a virtual environment. The need 
for trust is more urgent especially when people have 
to share common policies or deliberations. 

PICOLA requires each user to create a profile 
for receiving additional information, as listed in 
Figure 1. As well as a user name and password, the 
user must insert the area where he or she lives, the 
issue he or she would like to discuss, and the role he 
or she wants to play: moderator, observer, or delib- 
erator. This tool also allows users to add a picture to 
their profiles. Moreover, each user can employ so- 
called emoticons in order to display current feelings. 
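
The profile fields listed above map naturally onto a small record type. The following is a hedged sketch rather than PICOLA's actual data model: the class and field names are invented for illustration, and storing a password hash instead of the raw password is an assumption about good practice, not something the source states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    MODERATOR = "moderator"
    OBSERVER = "observer"
    DELIBERATOR = "deliberator"


@dataclass
class UserProfile:
    """Hypothetical profile record based on the fields described in the text."""
    user_name: str
    password_hash: str              # assumption: store a hash, never the raw password
    area: str                       # where the user lives
    issue: str                      # issue the user would like to discuss
    role: Role
    picture: Optional[str] = None   # optional picture file or URL
    emoticon: Optional[str] = None  # current feeling, if the user sets one

    def public_view(self) -> dict:
        """Fields shown to other participants; credentials are left out."""
        return {
            "user_name": self.user_name,
            "area": self.area,
            "issue": self.issue,
            "role": self.role.value,
            "picture": self.picture,
            "emoticon": self.emoticon,
        }


profile = UserProfile("maria", "<hash>", "Pittsburgh", "national security", Role.DELIBERATOR)
profile.emoticon = ":)"
```

Only `public_view` would be shared with other discussants, since it is the visible information about a person, not the login data, that plays the mediating role.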

All these features work like moral mediators 
because they mediate the relation among users so 
they can get acquainted with each other. Being 
acquainted with each other is one of the most 
important conditions to enhance cooperation and to 
generate trust. Indeed, people are more inclined to 
reciprocate and to be engaged in cooperative behav- 
ior if they know to some extent the people they are 
interacting with. This diminishes prejudices and 
people are less afraid of the possible negative out- 
comes. In this sense, sharing user profiles could be 
viewed as a generator of social capital and trust 
(Putnam, 2000). 

Each issue to be discussed during a session is 
introduced by an overview that provides general 
information (see Figure 2 on the issue of national 
security). Text is integrated with audio and video 
performances. The moral mediation mainly occurs 
for two reasons. First, people are sufficiently in- 
formed to facilitate sharing a starting point and 
giving up prejudices that may have arisen due to a 
lack of information. Second, the fact that different 
perspectives are presented helps users to weigh and 
consider the various opinions available. Moreover, 
the multimedia environment provides video and audio
information so that the user may access additional
resources. For instance, watching a video involves
not only “cold” reasons, but also emotions that mediate
the user’s response to a particular issue. 8

Figure 1. User profile

Figure 2. Discussion of the issues

The discussion group is realized by a Web-based 
tool, which provides the possibility to visualize all the 
discussants sitting at a round table. The discussion is 
led by a moderator who has to establish the order of
speakers. Each participant is represented with a
profile showing his or her data, an emoticon, and a 
picture. Everyone has basically two minutes for 
maintaining a position and then has to vote at the end 
of the discussion (see Figure 3). 
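
The equal-time and equal-vote rules of such a session can be sketched as follows. This is an illustration under stated assumptions, not PICOLA's code: `schedule_turns`, `tally`, and the `Turn` record are hypothetical, and the two-minute slot is taken from the description above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List

SPEAKING_SECONDS = 120  # "two minutes" per participant


@dataclass
class Turn:
    speaker: str
    start: int  # seconds from the start of the session
    end: int


def schedule_turns(order: List[str], seconds_each: int = SPEAKING_SECONDS) -> List[Turn]:
    """Give every speaker, in the moderator's order, an equal time slot."""
    return [
        Turn(speaker, i * seconds_each, (i + 1) * seconds_each)
        for i, speaker in enumerate(order)
    ]


def tally(votes: Dict[str, str]) -> Counter:
    """One participant, one ballot: each vote counts exactly once."""
    return Counter(votes.values())


turns = schedule_turns(["ana", "ben", "chen"])   # order set by the moderator
result = tally({"ana": "yes", "ben": "no", "chen": "yes"})
```

Keying the ballots by participant name means a second vote from the same person overwrites the first rather than adding weight, which is one simple way to keep the poll equal.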

Computers can afford civic engagement (Davis,
Elin, & Reeher, 2002). Creating a public online
community is not just a way of allowing interaction
among people: It can improve the overall assessment
of public interest policies, giving people more information,
generating a public base for discussion, and
implementing participation in the democratic life of
a country. This is possible and welcome if we
assume that the computer is, or should be, a very
low-cost technology so that a larger number of
people can exploit such a service.

Figure 3. Group discussion

FUTURE TRENDS 

Moral mediators are widespread in our world and 
especially in our technological world. The moral- 
mediating role of human-computer interaction is just 
one of the possible interesting cases. However, 
other issues remain to be discussed and deepened. 

First of all, the concept of moral mediator implic- 
itly assumes that the effort of a moral or political 
deliberation is shared between users and computers. 
It could be useful to explore the details on how this 
distribution actually takes place to enhance the 
ethical mediation and/or to understand possible nega- 
tive effects of the technological tools. 

Second, an interesting line of research concerns 
the kind of moral affordance involved in such inter- 
action. As we have argued, objects apparently are 
not relevant from the moral point of view, namely, 
they are inert. We are acquainted with the idea that 
a human being is morally important because of his or
her intrinsic value and dignity. But how could a
computer be intrinsically moral? Through the con- 
cept of moral mediator, we have suggested that 
nonhuman things can be morally useful in creating 
new moral devices about how to behave (the cell- 
phone example or certain kinds of interaction with 
computer programs). Hence, their features afford a 
moral use. The concept of affordance, first intro- 
duced by Gibson (1979), might help to clarify this 
point about which features transform an object or 
artifact into a moral mediator. 



CONCLUSION 

In this article we have tried to show some aspects of 
the kind of moral mediation that is involved in human- 
computer interaction. We have argued that objects 
that seem inert from a moral point of view may 
contribute to enhancing moral understanding and behavior.
Tools and software may have moral impact
because, for instance, they promote new kinds of 
relations among users, create new moral perspec- 
tives, and/or provide further support to old ones. 

We have illustrated this approach showing how a 
Web-based tool such as PICOLA may provide 
useful support for enhancing deliberative democracy. 
As we have seen, a multimedia environment 
may be very useful in order to create consensus, to 
inform citizens about a given issue, and to develop 
considered beliefs. 

KEY TERMS 

Civic Engagement: Describes the level of citizens’ 
participation in all those activities concerned 
with fostering democratic values and public virtues 
such as trustworthiness, freedom of speech, and 
honesty. 

Deliberative Democracy: A model based on a 
consensus-oriented decision-making process in which 
parties can freely participate and whose outcome 
is the result of reasoned and argumentative discussions. 
This model aims to achieve an impartial 
solution for political problems. 

Morality: The complex system of principles 
given by cultural, traditional, and religious conceptions 
and beliefs, which human beings employ for 
judging things as right or wrong. 

Moral Agency: The capacity to express moral 
judgements and to abide by them. 

Moral Mediators: External resources (artifacts, 
tools, etc.) can be defined as moral mediators 
when they actively shape the moral task one is 
facing by uncovering valuable information that 
otherwise would remain hidden and unattainable. 

Online Community: A system of Internet users 
sharing interests and interacting frequently in the 
same areas, such as forums and chat Web sites. 

PICOLA Project: The PICOLA Project (Public 
Informed Citizen On-Line Assembly) is an initiative 
spearheaded by Carnegie Mellon University to 
develop and implement, through online tools, a virtual 
agora for public consultation regarding public 
policy issues. 

Social Capital: Refers to connections among 
individuals — social networks and the norms of reci- 
procity and trustworthiness that arise from them 
(Putnam, 2002, p. 19). 



ENDNOTES 

1 Although Norman (2004) did not investigate 
the moral dimension of HCI (human-computer 
interaction), he has recently explored its emo- 
tional dimension. 



For more information about the relation be- 
tween computers and ethics, see Ermann and 
Shauf (2002) and Johnson (2000). 

The idea of moral mediators is derived from the 
analysis of the so-called epistemic mediators 
Magnani (2001) introduced in a previous book. 
Even if a tool or software can provide support 
or help, it may also contribute to create new 
ethical concerns. One of the most well-known 
problems related to the Web, and to the Inter- 
net in general, is the one concerning privacy. 
For further information on this issue, see 
Magnani (in press). For more information, visit 
the Electronic Privacy Information Center 
(EPIC) at http://www.epic.org/. 

For further details about this issue, see Kirkman 
(2002) and Nagle (1998). 

More details are in Magnani (2005, chs. 1 and 6). 

Further information about the PICOLA project 
can be found at 
http://communityconnections.heinz.cmu.edu/picola/index.html. 

On the role of emotions, see Norman (2004), 
Picard and Klein (2002), and Tractinsky, Katz, 
and Ikar (2000). 

INTRODUCTION 

Information systems are designed for the people, by 
the people. The design of software systems with the 
help of software systems is another aspect of hu- 
man-computer interfaces. New methods and their 
(non-)acceptance play an important role. Motiva- 
tional factors of systems developers considerably 
influence the type and quality of the systems they 
develop (Arbaoui, Lonchamp & Montangero, 1999; 
Kumar & Bjoern-Andersen, 1990). To some extent, 
the quality of systems is a result of their developers’ 
willingness to accept new and (supposedly) better 
technology (Jones, 1995). A typical example is 
component-based development methodology 
(Bachmann et al., 2000; Cheesman & Daniels, 2001). 
Despite considerable publication effort and public lip 
service, component-based software development 
(CBD) appears to be getting a slower start than 
anticipated and hoped for. One key reason stems 
from the psychological and motivational attitudes of 
software developers (Campell, 2001; Lynex & 
Layzell, 1997). We therefore analyze the attitudes 
that potentially hamper the adoption of the compo- 
nent-based software development approach. 
Maslow’s Hierarchy of Needs (Boeree, 1998; Maslow, 
1943) is used for structuring the motives. 

BACKGROUND 

The Human Side of Software 
Engineering 

Kunda and Brooks (1999) state that “software sys- 
tems do not exist in isolation ... human, social and 
organizational considerations affect software pro- 
cesses and the introduction of software technology. 
The key to successful software development is still 
the individual software engineer” (Eason et al., 
1974; Kraft, 1977; Weinberg, 1988). Differences between 
individual software engineers may account for a variance 
in productivity of up to 300% (Glass, 2001). By contrast, 
no other single factor is able to 
provide an improvement of more than 30%. An 
individual’s motivation, ability, productivity, 
and creativity have by far the biggest influence 
on the quality of software development, irrespective 
of the level of technological or methodological 
support. Therefore, it is worthwhile investigating 
why many software engineers do not 
wholeheartedly accept component-based methods 
(Lynex & Layzell, 1997). 

Software development in general introduced a 
new type of engineer who shows marked differences 
when compared to (classical) engineers 
(Badoo & Hall, 2001; Campell, 2001; Eason et al., 
1974; Kraft, 1977; Kunda & Brooks, 1999; Lynex & 
Layzell, 1997). The phenomenon is not fully understood 
yet but seems to be related to the peculiarities 
of software (Brooks, 1986), the type of processes 
and environments needed to develop software 
(Kraft, 1977), and especially the proximity 
of software development to other mental processes 
(Balzert, 1996). 

Maslow’s Hierarchy of Needs 

Maslow’s theory (Boeree, 1998; Huitt, 2002; Maslow, 
1943; McConnell, 2000) provides a practical classi- 
fication of human needs by defining a five-level 
Hierarchy of Needs (Figure 1). 

Figure 1. Maslow’s hierarchy of needs (levels, from highest to lowest: self-fulfillment; recognition; social environment; security (physical, economic ...); basic physiological needs (survival)) 

The five levels are as follows: 

• Basic Physiological Needs (Survival): At 
this level, the individual is fighting for survival 
against an adverse environment, trying to avert 
hunger, thirst, cold, and inconvenient and detracting 
physical work environments. 

• Security (Physical, Economic ...): On this 
level, the individual is concerned with the stability 
of his or her future and the safety of the 
environment. Worries include job security, loss 
of knowledge, loss of income, health, and so 
forth. 

• Social Environment: This category includes 
the need to have friends, belong to a group, and 
to give and receive love. 

• Recognition: Individuals strive to receive appropriate 
recognition and appreciation at work 
and to be recognized as having a valuable 
opinion. 

• Self-Fulfillment: This level is considered the 
highest stage attainable in the development of 
a person, drawing satisfaction from the realization 
of one’s own contribution to a goal and 
the fulfillment of one’s full potential as a 
human being. 

Reuse and Component-Based 
Software Development (CBD) 

An old dream in software development is to avoid 
unnecessary duplication of work by consistently and 
systematically reusing existing artifacts. Reuse prom- 
ises higher productivity, shorter time-to-market, and 
higher quality (Allen, 2001; Cheesman & Daniels, 
2001). Initially, ready-made pieces of software were 
made available; these delivered a defined functionality 
in the form of a black box (i.e., without divulging 
the internal structure to the buyer/user). They were 
called COTS (commercial off the shelf) (Voas, 
1998). Later, an improved and more restricted con- 
cept was employed: software components (Bachmann 
et al., 2000; Cheesman & Daniels, 2001; Woodman 
et al., 2001). Software components have to fulfill 
additional requirements, restrictions, and conven- 
tions beyond the properties of COTS. To a user of a 
software component, only its interfaces and functionality 
are known, together with the assurance that 
the component obeys a specific component model. 
This component model defines how the component 
can be integrated with other components, the con- 
ventions about the calling procedure, and so forth. 
The internal structure, code, procedures, and so 
forth are not divulged — it is a black box. 
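
The black-box idea can be illustrated with a minimal sketch. It is not taken from any particular component model; the names SpellChecker, VendorSpellChecker, and count_errors are hypothetical and chosen only for illustration. The integrator's code is written against the published interface alone and makes no use of the component's internal state.

```python
from typing import Protocol


class SpellChecker(Protocol):
    """Published interface: the only thing a component user gets to see."""

    def check(self, text: str) -> list[str]:
        """Return the words in `text` that are considered misspelled."""
        ...


class VendorSpellChecker:
    """Hypothetical third-party component; its internals are treated as hidden."""

    def __init__(self) -> None:
        # Private state: under a black-box model the integrator cannot rely on it.
        self._known = {"component", "based", "development"}

    def check(self, text: str) -> list[str]:
        return [w for w in text.lower().split() if w not in self._known]


def count_errors(checker: SpellChecker, text: str) -> int:
    """Integrator code: depends on the interface only, not on the vendor class."""
    return len(checker.check(text))


print(count_errors(VendorSpellChecker(), "Component basd development"))  # prints 1
```

Any other implementation that offers the same check method could be substituted without changing count_errors, which is the property a component model is meant to guarantee.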

Systematic, institutionalized CBD needs a change 
in the attitude of software engineers, different work 
organization, and a different organization of the 
whole enterprise (Allen, 2001). 

Component-Based Development and 
Software Engineers’ Needs 

The acceptance of a new technology often meets 
with strong opposition caused by psychological motives, 
which can be traced to Maslow’s Hierarchy of 
Needs. 

Basic Physiological Needs 

This level does not have any strong relevance; 
software engineering is a desk-bound, safe, non- 
endangering activity. We have to recognize, how- 
ever, that very often software engineers have to 
struggle with adverse infrastructure (floor space, 
noise, etc.) (deMarco, 1985). 

Security 

The desire for security is threatened by numerous 
factors. The fears can be categorized into four 
groups: 

Losing the Job or Position 

• Job Redundancy: CBD promises consider- 
ably higher productivity and less total effort as 
a result of removing the redundancy of 
reimplementing already existing functions. This 
carries the threat of making an individual redundant, 
especially since the development of 
components very often is outsourced to some 
distant organization (e.g., India). 

• Implementing vs. Composing: deRemer 
(1976) stressed the difference between imple- 
menting a module/component (programming in 
the small) and building (composing) a system 
out of components (programming in the large). 
He emphasized the need for a different view 
and for new approaches and tools. Programming 
in the large needs a systems view, making 
many long-learned patterns of work obsolete, 
even counter-productive. 

• Changed Job Profile: The necessity of integrating 
existing components requires a different 
mindset than implementing a program 
from scratch (Vitharana, 2003). Does the software 
engineer have the ability or qualifications 
to fulfill the new profile of an integrator? 

• Loss of Knowledge and “Guru” Status: In 
traditional development, considerable domain 
know-how and low-level development know-how 
rests in the heads of seasoned developers 
who have developed software for many years. The 
use of components encapsulates and hides both 
implementation details and domain know-how. 
In addition, system development methods 
change. Much of the accumulated experience 
and know-how previously valuable to the em- 
ploying institution becomes irrelevant. 

• De-Skilling: In addition, certain de-skilling takes 
place at the lower level of software develop- 
ment (Kraft, 1977). The need for increased 
skills with respect to performing high-level com- 
position activities often is not recognized and 
appreciated by the individuals. 

Loss of Low-Level Flexibility 

• Pre-Conceived Expectations: Components 
often do not provide exactly what the original 
requirements specified. A developer then is 
challenged to find a compromise between the 
user requirements and the available compo- 
nents — a job profile dramatically different from 
developing bespoke software (Vitharana, 2003). 
Engineers also have to live with the good-enough 
quality (Bach, 1997; ISO/IEC, 2004) provided 
by the components and often are not able 
to achieve the best quality. This is often difficult to 
accept emotionally. 

• Revision of Requirements: Mismatches be- 
tween stated requirements and available com- 
ponents make it necessary to revise and adapt 
requirements (Vitharana, 2003). Requirements 
are no longer set in stone, in contrast to the 
assumptions of classical development methods 
(e.g., the waterfall model). 

• Uncertainty About Functionality of Com- 
ponents: By definition, the internal structure 
of a component is not revealed (Bachmann et 
al., 2000). Consequently, the developer has to 
rely on the description and claims provided by 
the component provider (Crnkovic & Larsson, 
2002; Vitharana, 2003). 
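
The compromise between stated requirements and available components can be sketched in a few lines. This is only an illustration under simplifying assumptions: requirements and component capabilities are reduced to plain feature names, and Component, unmet, best_fit, and the catalog entries are hypothetical. The developer sees only what the provider's description claims, picks the closest fit, and is left with a gap that has to be closed by revising the requirements or by additional work.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Component:
    """Hypothetical catalog entry: only the advertised features are known."""
    name: str
    features: frozenset


def unmet(required: set, component: Component) -> set:
    """Requirements that the component's description does not claim to cover."""
    return set(required) - component.features


def best_fit(required: set, catalog: Iterable[Component]) -> Optional[Component]:
    """Pick the component leaving the fewest requirements unmet (None if no candidates)."""
    return min(catalog, key=lambda c: len(unmet(required, c)), default=None)


required = {"export_pdf", "spell_check", "undo"}
catalog = [
    Component("EditorLite", frozenset({"export_pdf"})),
    Component("EditorPro", frozenset({"export_pdf", "undo", "macros"})),
]
choice = best_fit(required, catalog)
gap = unmet(required, choice)
print(choice.name, sorted(gap))  # prints: EditorPro ['spell_check']
```

The remaining gap (here, spell_check) is what forces the revision of requirements described above; the sketch also trusts the advertised feature list completely, which is exactly the uncertainty the last item points out.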

Lack of Confidence 

• Distrust in Component Quality: Quality 
problems experienced in the past have created 
a climate of distrust with respect to other 
developers’ software products. This feeling 
of distrust becomes stronger with respect to 
components, because their internals are not 
disclosed (Heineman, 2000; Vitharana, 2003). 
This situation becomes worse for so-called 
software of unknown provenance (SOUP) 
(Schoitsch, 2003). The current interest in open 
source programs is an indicator of a move- 
ment in the opposite direction. 

• Questions About Usability: Besides the 
issue of quality, problems with portability and 
interoperability of components, as often expe- 
rienced with COTS, also reduce the confi- 
dence in using components (Vecellio & Tho- 
mas, 2001). 

• Loss of Control of System: Engineers usu- 
ally like to understand fully the behavior of the 
designed system (the why) and exercise con- 
trol over the system’s behavior (the how). In 
CBD, due to the black-box character of the 
components, understanding and control can be 
achieved only to a limited extent, leaving a 
vague feeling of uncertainty. 

Effort for Reuse vs. New Development 

• Uncertainty Concerning the Outcome of 
the Selection Process: Successful CBD 
depends to a considerable extent on the prob- 
ability and effectiveness of finding a compo- 
nent with (more or less) predefined properties 
(Vitharana, 2003). Occasionally, this search 
will not be successful, causing a delay in the 
development schedule and wasting the effort 
spent on the search. 

• Effort Estimates: In general, software engi- 
neers underestimate the effort needed to build 
a system from scratch and overestimate the 
cost and effort of adapting a system. The 
reasons seem to be the effort initially needed 
to achieve a certain familiarity with the 
whole system before making even small adaptations, 
the learning curve (Boehm & Basili, 
2000), and the difficulty of, and often also the 
unwillingness to, become familiar with somebody 
else’s thoughts and concepts (not-invented-here 
syndrome). 

Social Environment 

• Reluctance to Utilize Outside Intellectual 
Property: Our society extends the notion of 
ownership to immaterial products like ideas 
and intellectual achievements. They are pro- 
tected by copyright, trademark, and patent 
legislation. Plagiarism is objected to and usually 
not condoned (Kock, 1999; Sonntag & 
Chroust, 2004). Reusing someone else’s ideas 
is often deemed inappropriate. 

• Immorality of Copying: In school, copying as 
a form of reuse and teamwork is usually dis- 
couraged. This might later cause some reluc- 
tance to actively share knowledge and to make 
use of someone else’s achievements (Disterer, 
2000). 

• Adopting a New Technology: The adoption 
of a new technology seems to follow an expo- 
nential law (Jones, 1995). It starts with a few 
early adopters, and others follow primarily be- 
cause of personal communication. The ten- 
dency of software developers to be introverts 
(Riemenschneider, Hardgrave, & Davis, 2002) 
might delay such a dissemination process. 

• Change of Work Organization: CBD needs 
a different work organization (Allen, 2001; 
Chroust, 1996; Cusumano, 1991; Wasmund, 
1995) resulting in rearranged areas of respon- 
sibility, power distribution, and status, poten- 
tially upsetting an established social climate 
and well-established conventions. 



Recognition 

A strong motivator for an individual is recognition by 
the relevant social or professional reference group, 
usually a peer group (Glass, 1983). 

• Gluing vs. Doing: On the technical level, 
recognition usually is connected to a particular 
technical achievement. Gluing together exist- 
ing components will achieve recognition only 
for spectacular new systems — and these are 
rare. Similarly, an original composer will gain 
recognition, whereas a person simply arranging 
music into potpourris usually goes unrecognized. 

• Shift of Influence and Power: Successful 
CBD needs a change in organization (Allen, 
2001; Kunda & Brooks, 1999), making persons 
gain or lose influence, power, and (job) pres- 
tige, a threat to the established pecking order. 

• The CBD Water Carrier: Organizations 
heavily involved in CBD often separate the 
component development from component deployment 
(Cusumano, 1991). Component development 
to a large extent is based on making 
existing modules reusable (“components as you 
go” and “components by opportunity”) (Allen, 
2001) and not on creating new ones from 
scratch (“components in advance”). Jobs in the 
reuse unit (similar to maintenance units) (Basili, 
1990) might be considered to require less know- 
how and thus receive lower prestige, despite 
the fact that these jobs often require greater 
know-how and experience than designing com- 
ponents from scratch. 

• Contempt for the Work of Others: The 
inherent individuality of software development, 
together with a multitude of different solutions 
to the same problem (i.e., there is always a 
better way), and the low quality of many soft- 
ware products have tempted many software 
developers into contempt for anyone else’s 
methods and work (not-invented-here syn- 
drome). 

• Rewarding Searching Over Writing: As 
long as the amount of code produced (lines of 
code) is a major yardstick for both project size 
and programmer productivity, searching for, finding, 
and incorporating a component will be less 
attractive than writing it anew. 

• Accounting for Lost Search Effort: There is 
no guarantee that even after an extensive (and 
time-consuming) search an appropriate com- 
ponent can be found (Vitharana, 2003). In this 
case, management must accept these occa- 
sional losses so as not to discourage searching 
for components (Fichman & Kemerer, 2001). 

Self-Fulfillment 

• Not Invented Here: The ability to design 
wonderful systems is a strong motivator for 
software engineers. This feeling goes beyond 
recognition by peers: one knows it oneself. 
This makes it difficult for developers to accept 
other people’s work (Campell, 2001; Disterer, 
2000) in the form of components. 

• No More Gold Plating: The feeling of self- 
fulfillment often cannot live with the knowl- 
edge that a system still should or must be 
improved, leading to endless effort in gold 
plating a system before delivery (or even there- 
after). Externally acquired components cannot 
be modified (i.e., gold plated) because of the 
inaccessibility of their code. 

• No Creative Challenge: Gluing together components 
provided by somebody else does not 
fulfill many engineers’ desire for novelty and, 
thus, is not considered to be a creative challenge. 
The highly creative process of finding 
the best-fitting component, restructuring the 
system, and perhaps modifying the require- 
ments for using existing components often is 
not appreciated. 

• No More Lone Artists: Software engineers 
aspire to become a Beethoven or a Michelangelo 
and not the directors of a museum arranging a 
high-class exhibition. Someone remarked that 
many system features are not needed by the 
users but are just a monument of their designer’s 
intellectual capability. Assembling components 
utilizes only someone else’s achievement. 

• Lack of Freedom: The limited choice of 
available components, the limitations of a com- 
ponent model, the need to obey predefined 
interfaces, and so forth restrict the freedom of 
development and often are seen as a limit to 
creativity. 

FUTURE TRENDS 

The fact that the software industry needs a large 
step forward with respect to productivity, quality, 
and time-to-market will increase the reuse of soft- 
ware artifacts and, as a consequence, will encour- 
age the use of component-based development meth- 
ods. Understanding the basic state of emotion of 
software developers will support efforts to over- 
come developers’ reluctance to accept this method- 
ology by emphasizing challenges and opportunities 
provided by the new methods, re-evaluating the 
importance and visibility of certain tasks, changing 
job profiles, and changing the reward and recogni- 
tion structure. 

The consequence might be that software design- 
ers wholeheartedly accept component-based meth- 
odologies not only as an economic necessity but also 
as a means of achieving the status of a great 
designer, as postulated by Brooks (1986). In turn, 
this could lead to a new level of professionalism in 
software development and would allow component- 
based development methods to be utilized fully in the 
field of software engineering. 

CONCLUSION 

Soft factors like motivation and psychological as- 
pects often play a strong role even in a technical, 
seemingly rational field like software engineering. 
We have identified and discussed key soft factors 
that often account for the slow uptake of component-based 
software development methods and related 
them to the framework of Maslow’s Hierarchy 
of Needs. The users of software components were 
the focus of this discussion. There are some indica- 
tions that for providers of components, a different 
emotional situation exists (Chroust & Hoyer, 2004). 

Recognition of the different levels of resistance 
and their psychological background will, among other 
aspects, allow approaching the problems in a psy- 
chologically appropriate form. The need of the soft- 
ware industry to come to terms with its problems of 
quality, cost, and timeliness makes this a necessity. 

KEY TERMS 

Commercial Off the Shelf (COTS): Software 
products that an organization acquires from a third 
party with no access to the source code and for 
which there are multiple customers using identical 
copies of the component. 

Component-Based Development (CBD): In 
contrast to classical development (waterfall process 
and similar process models), CBD is concerned 
with the rapid assembly of systems from components 
(Bachmann et al., 2000) where: 

• components and frameworks have certified 
properties; and 

• these certified properties provide the basis for 
predicting the properties of systems built from 
components. 

Component Model: A component model speci- 
fies the standards and conventions imposed on de- 
velopers of components. This includes admissible 
ways of describing the functionality and other at- 
tributes of a component, admissible communication 
between components (protocols), and so forth. 

Maslow’s Hierarchy of Needs: Maslow’s 
Hierarchy of Needs (Boeree, 1998; Maslow, 1943; 
McConnell, 2000) is used as a structuring means for 
the various factors. It defines five levels of need: 

• self-fulfillment 

• recognition 

• social environment (community) 

• security (physical, economic ...) 

• basic physiological needs (survival) 

In general, the needs of a lower level must be 
largely fulfilled before needs of a higher level arise. 



Soft Factors: This concept comprises an ill-defined 
group of factors that are related to people, 
organizations, and environments, such as motivation, 
morale, organizational culture, power, politics, feelings, 
perceptions of environment, and so forth. 

Software Component: A (software) compo- 
nent is (Bachmann et al., 2000): 

• an opaque implementation of functionality 

• subject to third-party composition 

• in conformance with a component model 

Software Engineering: (1) The application of a 
systematic, disciplined, quantifiable approach to de- 
velopment, operation, and maintenance of software; 
that is, the application of engineering to software and 
(2) the study of approaches as in (1) (Abran, Moore, 
Bourque, Dupuis & Tripp, 2004). 

INTRODUCTION 

Usability has become a critical quality factor in 
software systems, and it has been receiving increas- 
ing attention over the last few years in the SE 
(software engineering) field. HCI techniques aim to 
increase the usability level of the final software 
product, but they are applied sparingly in mainstream 
software development, because there is very little 
knowledge about their existence and about how they 
can contribute to the activities already performed in 
the development process. There is a perception in 
the software development community that these 
usability-related techniques are to be applied only 
for the development of the visible part of the UI 
(user interface) after the most important part of the 
software system (the internals) has been designed 
and implemented. 

Nevertheless, the different paths taken by HCI 
and SE regarding software development have re- 
cently started to converge. First, we have noted that 
HCI methods are being described more formally in 
the direction of SE software process descriptions. 
Second, usability is becoming an important issue on 
the SE agenda, since the user base of software products 
is ever increasing and the degree of user 
computer literacy is decreasing, leading to a greater 
demand for usability improvements in the software 
market. However, the convergence of HCI and SE 
has uncovered the need for an integration of the 
practices of both disciplines. This integration is a 
must for the development of highly usable systems. 



In the next two sections, we will look at how the 
SE field has viewed usability. Following upon this, 
we address the existing approaches to integration. 
We will then detail the pending issues that stand in 
the way of successful integration efforts, concluding 
with the presentation of an approach that might be 
successful in the integration endeavor. 

Traditional View of Usability in 
Software Engineering 

Even though usability was mentioned as a quality 
attribute in early software quality taxonomies 
(Boehm, 1978; McCall, Richards, & Walters, 1977), 
it has traditionally received less attention than other 
quality attributes like correctness, reliability, or effi- 
ciency. While the development team alone could 
deal with these attributes, a strong interaction with 
users is required to cater for usability. With SE’s aim 
of making the development a systematic process, the 
human-induced unpredictability was to be avoided at 
all costs, thus reducing the interaction with users to 
a minimum. 

The traditional relegation of usability in SE can be 
seen in how marginally interaction design 
is present in the main software development 
process standards: ISO/IEC Std. 12207 (1995) 
and IEEE Std. 1074 (1997). The ISO 12207 standard 
does not mention usability and HCI activities directly. 
It says that possible user involvement should 
be planned, but this involvement is limited to 
requirements-setting exercises, prototype demonstrations, 
and evaluations. When users are mentioned, 
they play a passive role in the few activities 
in which they may participate. The IEEE standard 
1074 only mentions usability in connection with UI 
requirements and risk management. Neither of the 
two standards addresses any of the activities needed 
to manage the usability of the software product. 

Recent Changes Regarding Usability 
Awareness 

There has been a noticeable shift in the attention 
paid to usability in the SE field in recent years. 
For example, important overlapping areas have been 
identified in the SWEBOK (Guide to the Software Engineering 
Body of Knowledge) (IEEE Software Engineering 
Coordinating Committee, 2001), which 
is an effort to gather what is considered commonly 
accepted knowledge in the SE field. The SWEBOK 
requirements engineering knowledge area includes 
some techniques that are not identified by the au- 
thors as belonging to HCI, but they are indeed 
standard HCI techniques: interviews, scenarios, 
prototyping, and user observation. Additionally, good 
communication between system users and system 
developers is identified as one of the fundamental 
tenets of good SE. Communication with users is a 
traditional concern in HCI, so this is an overlapping 
area between HCI and SE. Usability is mentioned as 
part of the quality attributes and highlighted in the 
case of high dependability systems. It is also men- 
tioned with regard to the software testing knowledge 
area. The work by Rubin (1994) is listed as part of 
the reference material for usability testing. 

The approval of Amendment 1 to standard ISO/ 
IEC 12207 (ISO/IEC, 2002), which includes a new 
process called Usability Process, by the ISO in 2002 
represented a big change regarding the relevance of 
usability issues in the SE field. With the release of 
this amendment, the ISO recognized the importance 
of managing the usability of the software product 
throughout the life cycle. The main concepts in a 
human-centered process, as described in the ISO 
13407 standard (1999), are addressed in the newly 
created usability process. The approach taken is to 
define in the usability process the activities to be 
carried out by the role of usability specialist. Some 
activities are the sole responsibility of the usability 
specialist, while others are to be applied in association 
with the role of developer. The first activity in 
the usability process is 6.9.1, Process implementa- 
tion, which should specify how the human-centered 
activities fit into the whole system life cycle process, 
and should select usability methods and techniques. 
This amendment to the ISO/IEC 12207 standard 
highlights the importance of integrating usability 
techniques and activities into the software develop- 
ment process. 

The fact that an international standard considers 
usability activities as part of the software develop- 
ment process is a clear indication that HCI and 
usability are coming onto the SE agenda, and that 
integrating HCI practices into the SE processes is a 
problem that the software development community 
needs to solve quite promptly. 

BACKGROUND 

This section details existing integration proposals. 
We will only consider works that are easily acces- 
sible for software practitioners, that is, mainly books, 
since average software practitioners do not usually 
consider conference proceedings and research jour- 
nals as an information source. 

Only a few of the numerous HCI methods give 
indications about how to integrate the usability ac- 
tivities with the other activities in the overall soft- 
ware development process. Of these works, some 
just offer some hints on the integration issue 
(Constantine & Lockwood, 1999; Costabile, 2001; 
Hix & Hartson, 1993), while others are more de- 
tailed (Lim & Long, 1994; Mayhew, 1999). 

Hix and Hartson (1993) describe the communi- 
cation paths that should be set up between usability 
activities (user interaction design) and software 
design. They strictly separate the development of 
the UI from the development of the rest of the 
software system, with two activities that connect the 
two parts: systems analysis and testing/evaluation. 
The systems analysis group feeds requirements to 
both the problem domain design group and the user 
interaction design group. It is a simplistic approach 
to HCI-SE integration, but the authors acknowledge 
that “research is needed to better understand and 
support the real communication needs of this com- 
plex process” (p. 112). 

Constantine and Lockwood (1999) offer some
advice for integrating usability and UI design into the 
product development cycle, acknowledging that there 
is no one single way of approaching this introduction. 
Therefore, they leave the issue of integration to be 
solved on a case-by-case basis. They state that 
“good strategies for integrating usability into the life 
cycle fit new practices and old practices together, 
modifying present practices to incorporate usability 
into analysis and design processes, while also tailor- 
ing usage-centered design to the organization and its 
practices” (p. 529). 

Costabile (2001) proposes a way of modifying the
software life cycle to include usability. The basis 
chosen for such modifications is the waterfall life 
cycle. The choice of the waterfall life cycle as a 
“standard” software life cycle is an important draw- 
back of Costabile’s proposal, as it goes against the 
user-centered aim of evaluating usability from the 
very beginning and iterating to a satisfactory solution. 
In the waterfall life cycle, paths that go back are 
defined for error correction, not for completely chang- 
ing the approach if it proves to be wrong, since it is 
based on frozen requirements (Larman, 2001). Glass
(2003) acknowledges that “requirements frequently 
changed as product development goes under way 
[...]. The experts knew that waterfall was an 
unachievable ideal” (p. 66). 

MUSE (Lim & Long, 1994) is a method for 
designing the UI, and the work by Lim and Long 
includes its detailed integration with the JSD (Jack- 
son System Development) method. The authors state 
that MUSE, as a structured method, emphasizes a 
design analysis and documentation phase prior to the 
specification of a “first-best-guess” solution. There- 
fore, MUSE follows a waterfall approach, not a truly 
iterative approach. Regarding its integration with 
other processes, JSD is presented in this work as a 
method that is mainly used for the development of 
real-time systems. Real-time systems account for a 
very small part of interactive systems, so the integra- 
tion of MUSE with JSD is not very useful from a 
generic point of view. Additionally, structured design 
techniques like structured diagrams or semantic nets 
make it difficult to adapt to processes based on other 
approaches, in particular to object-oriented develop- 
ment. 

Mayhew (1999) proposes the Usability Engineer-
ing Lifecycle for the development of usable UIs.
This approach to the process follows a waterfall life
cycle mindset: an initial Analysis phase, followed by 
a Design/Test/Development phase, and finally an 
Installation phase. The Analysis stage is only re- 
turned to if not all functionality is addressed, and this 
is, therefore, not a truly iterative approach to soft- 
ware development. Nevertheless, it is one of the 
more complete HCI processes from the SE point of 
view. Although Mayhew claims that the method is 
aimed at the development of the UI only, the 
activities included in this life cycle embrace an 
important part of requirements-related activities 
(like, for example, contextual task analysis). Links 
with the OOSE (object-oriented software engi- 
neering) method (Jacobson, Christerson, Jonsson, 
& Overgaard, 1993) and with rapid prototyping 
methods are identified, but the author acknowl- 
edges that the integration of usability engineering 
with SE must be tailored and that the overlap 
between usability and SE activities is not com- 
pletely clear. Accordingly, Mayhew presents UI 
development as an activity that is quite independent 
from the development of the rest of the system. 

PENDING ISSUES FOR 
INTEGRATION 

Having studied the existing integration proposals 
and considering the most widespread perception of 
usability issues among developers, we can identify 
four main obstacles that need to be overcome in 
order to successfully integrate HCI practices into 
the overall software development process: UI de- 
sign vs. interaction design, integration with require- 
ments engineering, iterative development, and user- 
centered focus throughout the development pro- 
cess. 

UI Design vs. Interaction Design 

One of the biggest obstacles for HCI-SE integration 
is the existing terminology breach and the disparity 
in the concepts handled. These differences are 
especially noticeable in the denomination of what 
can be considered the main HCI area of expertise: 
UI design. As it is understood in the HCI field, UI 
design represents a wider concept than in SE termi- 
nology. 

In SE, UI design refers only to the design of the
concrete visual elements that will form the UI and its 
response behavior (in visual terms). It does not 
include any activity related to requirements engi- 
neering. On top of this, there is a widely-accepted 
principle in SE stating that the part of the system that 
manages the visual elements of the UI should be 
separated from the business logic (the internal part 
of the system). The strict application of this principle 
results in a UI design that is not directly related to the 
design of the internal system processes. On the 
graphical side, UI design is produced by graphic 
designers, whose work is governed by aesthetic 
principles. It is this conception of UI design that 
makes SE regard it as part of a related discipline, not 
as one of the core activities that matter most for any 
software development project. 

On the other hand, HCI literature uses the term 
to represent a broader set of activities. Most HCI 
methods label themselves as methods for the design 
of the UI (Hix & Hartson, 1993; Lim & Long, 1994; 
Mayhew, 1999), while including activities that are 
outside the scope of UI design in SE terms (like user 
and task analysis). 

With the aim of a successful integration, we 
suggest the use of a different term for what HCI 
considers UI design in order to raise SE receptive- 
ness to the integration efforts. Specifically, we 
propose the term interaction design, meaning the 
coordination of information exchange between the 
user and the system (Ferre, Juristo, Windl, & 
Constantine, 2001). Software engineers may then 
understand that usability is not just related to the 
visible part of the UI, since activities that study the 
best suited system conception, user needs and ex- 
pectations and the way tasks should be performed 
need to be undertaken to perform interaction design. 
All these additional issues belong to the require- 
ments engineering subfield of SE, as detailed in the 
next subsection. 

Integration with Requirements 
Engineering 

Some integration proposals considered in the previ- 
ous section are based on two development pro- 
cesses carried out in parallel: the interaction design 
process (following an HCI method) and the process 
that develops the rest of the system (a SE process).
The underlying hypothesis is that the issues with
which each process deals are not directly related, 
that is, some coordination between the two pro- 
cesses is needed, but they may be basically carried 
out separately. 

Nevertheless, looking at the key tasks for final 
product usability, like user and task analysis, user 
observation in their usual environment, needs analy- 
sis and the development of a product concept that 
can better support such needs, we find that they are 
all activities that, to a lesser or greater extent, have 
been traditionally carried out within the framework 
of requirements engineering, a SE subdiscipline. 
HCI can provide a user-centered perspective to 
assure that these activities are performed in the 
software process with positive results regarding the 
usability of the final software product, emphasizing 
this perspective throughout the development pro- 
cess. In short, there is a big enough overlap between 
the two disciplines to call for a tight integration of 
activities and techniques from each discipline. 

Some HCI authors, like Mayhew (1999), argue
that requirements-related activities should be per- 
formed by HCI experts instead of software engi- 
neers and that the rest of the development process 
should build upon the HCI experts’ work. This 
approach may be valid in organizations where us- 
ability is the main (or only) quality attribute to be 
aimed for and where a usability department is one of 
the leading departments in the organization. For 
other organizations that are not so committed to 
usability, it is not worth their while to completely 
abandon their way of performing requirements en- 
gineering (a recognized cornerstone of any software 
development project in SE) when the only gain is an 
improvement in just one quality attribute (usability) 
of the resulting product. We take the view that HCI 
experts may work together with software engineers, 
and the user-centered flavor of HCI techniques 
may greatly enrich requirements engineering tech-
niques, but the complete replacement of SE practices
by HCI practices in this area is not acceptable from a SE
point of view. Additionally, the big overlap between 
HCI and SE regarding the requirements-related 
issues makes the approach of undertaking two sepa- 
rate processes (HCI and SE) communicating through 
specific channels ineffective, because performing 
SE activities without a user-centered focus could 
invalidate the results from a usability point of view. 

Iterative Development

An iterative approach to software development is 
one of the basic principles of user-centered develop- 
ment according to HCI literature (Constantine & 
Lockwood, 1999; Hix & Hartson, 1993; ISO, 1999; 
Nielsen, 1993; Preece, Rogers, Sharp, Benyon, Hol- 
land, & Carey, 1994; Shneiderman, 1998). The 
complexity of the human side in human-computer 
interaction makes it almost impossible to create a 
correct design at the first go. 

For its part, SE literature has gradually come to
accept that an iterative as opposed to a waterfall life 
cycle approach is the best for medium to high 
complexity problems when the development team 
does not have in-depth domain knowledge. Never- 
theless, a waterfall mindset is still deeply rooted in 
day-to-day practice among software developers. 
The reason for this is that the waterfall is a very 
attractive software development model from a struc- 
tural viewpoint, because it gives the illusion of order 
and simplicity within such a complex activity (soft- 
ware systems development). Therefore, although 
SE acknowledges the virtues of the iterative ap- 
proach, which would appear to facilitate the integra- 
tion of a user-centered perspective, this approach is 
not usually applied in practice, which is a major 
deterrent. Additionally, as mentioned earlier, a com- 
mon mistake in the efforts for integrating HCI 
practices into software development has been to use 
the waterfall life cycle as a starting point. We argue
that a truly iterative approach should be high-
lighted as one of the greatest possible contributions
of HCI practice to overall software development, 
as it has been part of its core practices for a long 
time. 

A User-Centered Perspective 
throughout Development 

When usability activities are performed indepen- 
dently from the rest of development activities, there 
is a risk of losing the user-centered perspective 
somewhere along the way. This perspective under- 
lies the entire development process in HCI, since it 
is necessary for producing a usable software sys- 
tem. Therefore, a user-centered perspective needs 
to be conveyed to the developers that are to 
undertake all the activities that are not strictly
usability-related. This will ensure that usability is
considered throughout the development process, as 
other quality attributes (like, for example, reliability) 
are. 

When, because of the specific circumstances of
an existing software development organization, it is 
impossible or undesirable to hire a lot of HCI experts 
to apply the HCI techniques, developers will need to 
apply some of the techniques themselves. Indeed, 
we think that some common HCI techniques, like 
card sorting, user modeling or navigation design, 
could be undertaken by the software engineering 
team, provided that they receive adequate usability 
training. Some other HCI techniques that require a 
lot of usability expertise would still need to be applied 
by HCI experts. 

FUTURE TRENDS 

As a discipline, SE is pervasive in software develop- 
ment organizations all over the world. Its concepts 
are the ones with which the majority of developers 
are familiar, and this is especially true of senior 
management at software development organiza- 
tions. HCI, on the other hand, has been traditionally 
considered as a specialist field, but there is an 
increasing demand within the SE field for effective 
integration of its activities and techniques into the 
overall software process. Therefore, the trend is 
towards an effective integration of HCI techniques 
and activities into SE development practices. Teams 
will include usability specialists, and software engi- 
neers will acquire the basic usability concepts in 
order to improve team communication. Some soft- 
ware engineers may even be able to apply some HCI 
techniques. 

For multidisciplinary teams with a SE leadership 
to be workable, the terminology breach needs to be 
surmounted. For this purpose, we suggest that a 
usability roadmap aimed at software developers be 
drawn up. This roadmap would serve as a toolbox for 
software engineers who want to include HCI prac- 
tices in the development process currently applied at 
their software development organizations. It should 
then be expressed according to SE terminology and 
concepts, and it should include information for each 
HCI technique about what kind of activity it is 
applied for and about when in an iterative process its
application most contributes to the usability of the
final software product. Software developers may 
then manage usability activities and techniques along 
with SE ones. The only requirement for the existing 
development process would be that it should be truly 
iterative, since a waterfall approach would make 
any introduction of usability techniques almost irrel- 
evant. 
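
As a rough illustration of what such a roadmap might look like as data, the following Python sketch records, for each HCI technique, the kind of activity it is applied for and the point in an iterative process at which it is assumed to contribute most. The technique names are taken from this article; the class and function names, the stage labels, and the values assigned to each entry are hypothetical placeholders rather than recommendations.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """Hypothetical points in an iterative process (an assumption, not from the article)."""
    EARLY = "early iterations"
    MIDDLE = "middle iterations"
    LATE = "late iterations"


@dataclass(frozen=True)
class RoadmapEntry:
    technique: str          # HCI technique, named in SE-friendly terms
    activity: str           # kind of SE activity the technique is applied for
    best_stage: Stage       # when its application is assumed to contribute most
    needs_hci_expert: bool  # whether a usability specialist is assumed necessary


# Illustrative entries only: the technique names appear in the article,
# but the activity, stage, and expertise values are placeholders.
ROADMAP = [
    RoadmapEntry("contextual task analysis", "requirements engineering", Stage.EARLY, True),
    RoadmapEntry("user modeling", "requirements engineering", Stage.EARLY, False),
    RoadmapEntry("card sorting", "interaction design", Stage.EARLY, False),
    RoadmapEntry("navigation design", "interaction design", Stage.MIDDLE, False),
]


def techniques_for(activity, stage, team_has_hci_expert=False):
    """Return the names of techniques that fit an activity at a given stage."""
    return [
        entry.technique
        for entry in ROADMAP
        if entry.activity == activity
        and entry.best_stage is stage
        and (team_has_hci_expert or not entry.needs_hci_expert)
    ]


# A team without a usability specialist planning early requirements work.
print(techniques_for("requirements engineering", Stage.EARLY))
```

Indexing the entries by SE activity and iteration stage, rather than by HCI method, is one way of expressing the roadmap in SE terminology; the expertise flag reflects the earlier observation that some techniques can be applied by trained developers while others still call for HCI experts.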



CONCLUSION 

HCI and SE take different but complementary views 
of software development. Both have been applied 
separately in most projects to date, but overlap-
ping areas between both disciplines have been iden-
tified and the software development field is calling for
a tighter integration of HCI aspects into SE develop-
ment processes.

Existing integration proposals suffer from important
shortcomings, such as not being truly iterative or 
advocating a separate HCI process. There is a 
terminology breach between SE and HCI, apparent 
in the denomination of HCI’s main concern, UI 
design, which could be expressed as interaction 
design to assure better communication with soft- 
ware engineers. Additionally, it is crucial for usabil- 
ity to be present throughout the whole development 
process in order to maintain a proper user-centered 
focus. 

A usability roadmap expressed using SE termi- 
nology and concepts may help software developers 
to overcome these obstacles and to perform a suc- 
cessful integration of HCI aspects into the software 
development process. 

KEY TERMS

Interaction Design: The coordination of infor- 
mation exchange between the user and the system. 

Iterative Development: An approach to soft- 
ware development where the overall life cycle is 
composed of several iterations in sequence. 

Requirements Engineering: The systematic 
handling of requirements. 

Software Engineering: The application of a 
systematic, disciplined, quantifiable approach to the 
development, operation, and maintenance of soft- 
ware. 

Software Process: The development roadmap 
followed by an organization to produce software 
systems, that is, the series of activities undertaken to 
develop and maintain software systems. 

Software Requirements: An expression of the 
needs and constraints that are placed upon a soft- 
ware product that contribute to the satisfaction of 
some real world application. 

User-Centered Development: An approach 
to software development that advocates maintaining 
a continuous user focus during development, with 
the aim of producing a software system with a good 
usability level. 

INTRODUCTION

The reader is no doubt well aware of HCI’s empha- 
sis on the analysis of systems in which the computer 
plays the role of tool. The field encompasses positiv- 
ist and pragmatic approaches in analyzing the prod- 
ucts and the trajectories of use of technology (Coyne, 
1995; Ihde, 2002; Preece et al., 1994), and many 
useful guidelines for the design of task-oriented tools 
have been produced as a result. However, use value 
and efficiency increasingly are leaving consumers 
cold; society has always needed things other than 
tools, and expectations of personal digital products 
are changing. Once utilitarian, they are now ap- 
proached as experience, and Pat Jordan, for ex- 
ample, has successfully plotted the progression from 
functionality to usability to pleasure (Jordan, 2000). 
A precedent set by the Doors of Perception commu- 
nity (van Hinte, 1997) has seen slow social move- 
ments becoming more prevalent, design symposia 
dedicated to emotion, and traditional market re- 
search challenged by the suggestion that the new 
consumer values something other than speed and 
work ethics. This search for authenticity appears to 
be resistive to demographic methodologies (Boyle, 
2003; Brand, 2000; Lewis & Bridger, 2000) yet 
underpins important new approaches to sustainable 
consumption (Brand, 2000; Bunnell, 2002; 
Csikszentmihalyi & Rochberg-Halton, 1981; Fuad-
Luke, 2002; van Hinte, 1997). The next section 
introduces pragmatic and critical approaches to HCI 
before examining the importance of the artwork as 
authentic experience. 

BACKGROUND 

Pragmatism 

HCI’s activity revolves around tools. Its philosophi-
cal framework traditionally has been one of useful-
ness, demonstrated in terms of the workplace; it can
show “tangible benefits that can be talked of in cash 
terms ... providing clear cut examples of case 
studies where ... costs have been reduced, work 
levels improved, and absenteeism reduced” (Preece 
et al., 1994, p. 19). Winograd and Flores (1986) 
defined the scope of their investigation as being 
primarily “what people do in their work” and saw the 
issues arising from this study to be pertinent to 
“home-life as well” (p. 143). Interaction design is 
focused similarly on optimizing the efficiency of the 
tool: “Users want a site that is easy to use, that has 
a minimum of download time, and that allows them 
to complete their tasks in a minimal amount of time 
with a minimal amount of frustration” (Lazar, 2001,
p. 3). Both disciplines are increasingly taking into 
account the social situation of communities of users, 
and the constitutive nature of technology itself; that 
is, it is understood that the introduction of a technol- 
ogy into society often is merely the beginning rather 
than the culmination of the cycle of appropriation. It 
is this socially constitutive aspect of technology that 
requires HCI to embrace not only pragmatism but 
also critical design practices. 

A Critical View 

A critical stance questions the role of technology 
with respect to social and political structures and 
inquires into the future of humankind in light of its 
appropriation. Design carries with it the ethical 
implications of its impact on communities, no matter 
that trajectories cannot be predetermined: “Design 
. . . imposes the interests of a few on the many” and 
is “a political activity” (Coyne, 1995, pp. 11-12). It
raises questions about human activity in meaning 
making in contrast with passivity. McCarthy & 
Wright (2003), for example, evoke the Apple Mac as 
an “object to be with” but go on to ask whether we 
“passively consume this message or complete the 
experience ourselves” (p. 88). The situation at the
moment is such that any experience of computers
that throws into relief the nature of the computer
itself is critical in nature. In challenging pragmatism,
the critical position raises questions about the need
for socially grounded performative meaning making
and about how truth is often seen to be embodied and
presented by the technological reasoning of the
machine. In practical terms, pragmatism in interac-
tion design is characterized by an emphasis on the
transparent interface, as championed by Winograd
& Flores (1986), in the early visions for ubiquitous
computing (Weiser, 1991), and by cognitive psy-
chologist Donald Norman (1999); the critical nature
of the artwork for HCI lies in its re-physicalization of
technology. This physicality, or obstinacy, is depen-
dent on the user’s awareness of the interface in
interaction, which platonic design seeks to minimize
if not erase.

Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.

Phenomenology: Disappearance and 
Obstinacy 

The notion of the tool is challenged by awareness; 
tools by definition disappear (Baudrillard, 1968; 
Heidegger, 1962). The phenomenologically invisible
interface was described first by Winograd and Flores 
(1986) in their seminal book, Understanding Com-
puters and Cognition, further elucidated by Mark
Weiser (1991) in his visions for the paradigm of
ubiquitous computing, and finally popularized by 
Donald Norman’s (1999) The Invisible Com-
puter. These texts take as a starting point a phenom-
enological view of action in the world; that is, as soon
as tools become conspicuous, to use a Heideggerian 
term, they are no longer available for the specific 
task in mind (or ready-to-hand), instead becoming 
obstinate and obtrusive (present-at-hand). As long 
as we approach a tool with the goal of using it for a 
specific task, such obstinacy will remain negative, 
but there does exist a different class of artifact 
where it becomes interesting, positive, and even 
necessary for the existence of the artifact in the first 
place. Objection-able might be an alternative to 
Heidegger’s terminology, embodying the idea of a 
thing regaining its materiality, that existence that is 
dependent on performative human perception. 
Baudrillard (1968) talks about objects as being non-
tools, about their being ready for appreciation, part
of a value system created through appreciation.
They are the noticed artifacts in our lives, and as
such are positioned to accrue the personal meaning 
that underlies truly authentic experience. This ap- 
proach takes the concept of phenomenological dis- 
appearance and shifts the focus from the transpar- 
ent interface to that of the visible object. The differ- 
ence lies in the location of breakdown and in its 
recasting as an essentially positive part of experi- 
ence. Winograd and Flores (1986) point out that 
meaning arises out of “how we talk about the world,” 
emerging in “recurrent patterns of breakdown and 
the potential for discourse about grounding” (p. 68); 
in the design of transparent, seamless experiences, 
breakdown is something to be prepared against 
rather than encouraged. The authors apply 
Heidegger’s readiness-to-hand to the design of sys- 
tems that support problem solving in the face of 
inevitable and undesirable breakdown situations. 
This article presents a case for an alternative appli- 
cation of an understanding of the same phenomeno- 
logical concepts towards the production of visible, 
objection-able artifacts. The following section intro- 
duces art as process and product, defined by objec- 
tion-ability, and examines the human need for art in 
light of this quality. 

ART 

Philosophical Importance of Art 

Art objects are those that are created expressly to 
spark cognition through a combination of order and 
disorder, through continuity and discontinuity 
(Pepperell, 2002). New languages are formed in 
expressing aspects of being in new ways. The 
artifact acts as a medium for expression (even if the 
intent of the artist is to erase authorship); but it is in 
the gap for open subjective reading, in the active 
articulation of pre-linguistic apprehension, that mean- 
ing is co-created (Eldridge, 2003). Thus, to conceive 
of a meaningful digital product is to intentionally 
invert the paradigm of the invisible computer. If we 
are designing products to become meaningful ob- 
jects, then in order to trigger that articulation, we 
must introduce discontinuity, even Heideggerian 
breakdown, opening up the space for the inter- 
subjective co-production of meaning (Baudrillard, 
1968; Eco, 1989; Greenhalgh, 2002; Heidegger, 1962; 
Ihde, 2002; Pepperell, 2002). In the use of denotative
symbolism, this space is increasingly closed to the 
reader. Even in more contextual design practice 
where connotative meaning is taken into account, the 
end goal remains one of seamlessness in context. It 
should be useful instead to approach the design 
process as an attempt to build coherent new vocabu- 
laries with computational materials in the manner of 
artists. This challenges the notion of design patterns 
in particular, which limit subjective reading; artists, in 
contrast, embark on an “obsessive, intense search” 
(Greenhalgh, 2002, p. 7), “working through the sub- 
ject matter” of their emotions in order to objectify 
subjective impulses” (Eldridge, 2003, p. 70). The goal 
of the artwork is not transparency but reflection, and 
Bolter and Gromala (2003) show us how digital arts 
practice illustrates this. In their reflection, digital 
artworks become material in interaction, questioning 
the very technologies they depend upon for their 
existence. It is proposed that we now seek to comple-
ment the current paradigms of computing through the
conscious use of computational materials to create 
new expressions, new objects, that are not efficient 
feature-rich tools but that instead may play a differ- 
ent, rich social role for human beings. 

MeAoW (MEDIA ART 
OR WHATEVER) 

The Center for Advanced Technology at New York 
University initiated a lecture series named The CAT’s
MeAoW to “facilitate artists’ engagement with tech- 
nologies and technologists” (Mitchell et al., 2003, p. 
156); the title was chosen intentionally to reflect the 
lack of consensus on terminology pervading the field. 
Artists always have been involved and often instru- 
mental in the development of technology, using it 
toward their own expressive ends and asking differ- 
ent sorts of questions through it to those of the 
scientists. The myriad uses of the computer in art 
reflect its plasticity, creating various interconnected 
fields of artistic endeavor, including graphics, video, 
music, interactive art, and practice making use of 
immersive technology, embedded and wearable sys- 
tems, and tangible computing (see, for example, the 
work of Thecla Schiphorst and Susan Kozel on the 
Whisper project, and Hiroshi Ishii and the work of the 
Tangible computing group at MIT’s Media Lab). 
Issues of temporality, perception, authorship, and 



surveillance continue to engage artists using these 
media as well as hardware, coding, and output as 
expressive materials in their own right. Steve Mann’s
work with wearable systems stems from his days as 
a photographer and interests in issues of surveil- 
lance — his wearable computing devices give power 
back to the user in their ability to survey for them- 
selves. Major communities of practice at the inter- 
sections of art and technology can be found cen- 
tered on organizations such as SIGGRAPH (since 
the mid-1960s), festivals such as Ars Electronica 
(since 1979), and MIT’s journal, Leonardo (since 
1968). The human-computer interaction commu- 
nity, in contrast, has been served by its special 
interest group, SIGCHI, since a relatively recent 
1982. It is only recently that a few HCI researchers 
and practitioners have begun to approach arts prac- 
tices and outcomes as rigorously as those method- 
ologies adopted from the social sciences. Anthony 
Dunne’s (1999) concept of the post-optimal object 
led him to explore methods for the design and 
dissemination of genotypes, artifacts intended not 
as prototypical models for production but as props 
for critical debate; Dunne’s work with Fiona Raby 
went on to place more or less functioning artifacts 
in volunteers’ households (Dunne & Raby, 2002), 
while his work with William Gaver and Elena Pacenti
resulted in the influential Cultural Probes, a col- 
lection of arts-based methods designed to inform 
the designers in the project in a far less directional 
way than user-centered design methodologies typically
do (Gaver et al., 1999).

FUTURE TRENDS 

Arts-based methods of evaluation and production 
are increasing in importance for human-computer 
interaction, which, in turn, indicates a rethinking of 
the end goals of interaction and computational 
product design to take account of the critical. In 
order to deal with these changes, HCI is having to
add new transdisciplinary methodologies to comple- 
ment its more comfortable user-centered ap- 
proaches. Noting the temptation for practitioners in 
HCI to bend the cultural probes to more quantitative 
ends, Gaver (2004) has renamed the approach 
probology in an effort to restate its initial aims and 
to reiterate the importance of asking the right type 
of questions through it. John Haworth’s (2003) Arts
and Humanities Research Board-funded project, 
Creativity and Embodied Mind in Digital Fine Art 
(2002-2003), produced critical products for the pub-
lic realm and was based on an “innovative interlock-
ing” of methods, including “creative practice and 
reflection, literature and gallery research, inter- 
views with artists, seminar-workshops, and an inter- 
active website,” emphasizing “the importance of 
both pre-reflexive and reflexive thought in guiding 
action” (pp. 1-3) (Candy et al., 2002). Mitchell et al.
(2003) extrapolate the broadening range of qualita- 
tive methodologies that HCI is encompassing to 
suggest a future inclusion of non-utilitarian evalua- 
tion techniques more typically employed by artists. 
The editors say these “differ radically from those of 
computer scientists,” making the important point 
that artists “seek to provoke as well as to understand 
the user” (Mitchell et al., 2003, p. 111). They do not
underestimate the fundamental rethinking this will 
require of user tests and correctly assert that these 
less formal methods offer more reward in terms of 
understanding “social impact, cultural meaning, and 
the potential political implications of a technology” 
(Mitchell et al., 2003, pp. 111-112). While these 
evaluative methods facilitate a different kind of 
understanding of the technology in context, the 
corresponding arts-based design process, through 
provocation, delivers a different kind of value to the 
user in the first place. Most elegantly, Bolter and 
Gromala (2003) have elucidated the apparent para- 
dox of the visible tool in their concept of a rhythm 
between the transparency made possible by a mas- 
tery of techniques and the reflectivity of the framing 
that gives meaning to its content. They call for a 
greater understanding of the nature of this rhythm in 
use, and this author adds to this the need for its 
connection to the experience of the design process 
itself. 



CONCLUSION 

There are compelling reasons for the presentation of 
an alternative to the digital product as information 
appliance, as recent concern over the status of the 
authentic points to a need for performative meaning 
making rather than passive acceptance of spectacle. 



Art is presented as a model for this process and its 
product, requiring, in turn, an inversion of the pri- 
macy of disappearance over materiality. Drawing 
attention to the object itself means introducing disor- 
der and breakdown necessary for dialogue and the 
socially based co-creation of meaning. It is sug- 
gested that an answer may lie in other design disci- 
plines beyond product design and within the explor- 
atory processes of art. 

KEY TERMS

Art: A coherent system or articulate form of 
human communication using elements of expres- 
sion, and the search for new expressions articulating 
the human condition. This can include all forms of 
expression; for example, the visual and plastic arts, 
drama, music, poetry, and literature, and covers both 
process and product. Art may, in its own right, be 
conservative, pragmatic, critical, or radical. 

Authenticity: The agentive participation in mean- 
ing making, as opposed to passive reception. This is 
the only way in which an individual can relate 
incoming information to the context of his or her own 
lifeworld, without which meaning does not exist for 
that person. We often sense the lack of authenticity 
in interaction without necessarily understanding our 
own misgivings. 

Breakdown: A term used by German philoso- 
pher Martin Heidegger, originally with negative con- 
notations to describe any cognitive interruption to a 
smooth interaction, or coping, in a situation. It is in 
breakdown that opportunities for human communi- 
cation arise. 

Critical Stance: Any approach to an accepted 
system that intentionally highlights issues of power 
structures supported by it, often emancipatory in 
nature and always political. 

Expression: The utterance through any language
system of prelinguistic emotion or understanding
toward the creation of consensual meaning
between people.

Invisible Computer: A computer or computer 
interface that disappears cognitively either through 
user expertise or by direct mapping of the relation- 
ship between interface elements, and the actions 
afforded by them. Other current terms are trans- 
parency and seamlessness and their antonyms, 
reflection and seamfulness. 

Meaning Making: The constant goal of hu- 
mans is to understand the world we find ourselves in. 
Meaning is arrived at continuously through social 
interactions with other individuals. 



Pragmatism: The thoroughly practical view of 
praxis in which theory is not separate from action but 
a component of useful action in its application to a 
certain situation. In HCI, this takes into account the 
hermeneutic nature of product or system develop- 
ment and appropriation. 

Sustainable Consumption: A recent move- 
ment in product design and consumer research on 
the need for a change in our patterns of consump- 
tion. The work cited here focuses particularly on the 
meaningfulness of the products and services we 
consume as integral to this shift in attitude. 

Tool: An artifact used to achieve specific, pre- 
determined goals. Defined by HCI and certain 
branches of philosophy by its disappearance in use. 

INTRODUCTION

By looking closely at the term online learning, we
could arrive at a simple definition, which could be the 
use by students of connected (online) computers to 
participate in educational activities (learning). While 
this definition is technically correct, it fails to explain 
the full range and use of connected computers in the 
classroom. Historically, the term appears to have 
evolved as new information and communication 
tools have been developed and deployed. For example,
in the early stages of development, Radford
(1997) used the term online learning to denote
material that was accessible via a computer using 
networks or telecommunications rather than mate- 
rial accessed on paper or other non-networked 
media. Chang and Fisher (1999) described a Web- 
based learning environment as consisting of digitally 
formatted content resources and communication 
devices to allow interaction. Zhu and McKnight 
(2001) described online instruction as any formal 
educational process in which the instruction occurs 
when the learner and the instructor are not in the 
same place and Internet technology is used to pro- 
vide a communication link among the instructor and 
students. Chin and Ng Kon (2003) identified eight 
dimensions that constructed an e-learning frame- 
work. The range of definitions of online learning is 
not only a reflection of technological advancement 
but also a reflection of the variety of ways educa- 
tionalists at all levels use connected computers in 
learning. 

BACKGROUND 

Examples of Online Learning Activities 

In one learning scenario, a group of 10-year-old 
students following a pre-prepared unit in a super- 
vised computer laboratory may use the information 
storage capacity of the World Wide Web (WWW) to
gather additional resources to prepare a presentation
on weather patterns. In a second scenario, a
group of 14-year-olds studying the same topic in a 
classroom with a dedicated computer work station 
situated by the teacher’s desk could use the communicative
functions of the Internet to establish mailing
lists with meteorological staff to follow studies being
undertaken on weather patterns in a region. In a third
scenario, a group of 18-year-olds consisting of small
pockets of learners in isolated locations using home- 
based connected workstations may use an educa- 
tional courseware package, incorporating informa- 
tion storage and communicative functions to partici- 
pate in a complete distance unit, studying impacts 
and implications of climate change. In each of the 
scenarios described, students and teachers have 
used connected computers in distinct ways to achieve 
varied objectives. The technical competencies required,
the learning support provided, and the physical
location of the students in each scenario are
different and distinct. In each scenario, a definable 
learning environment can be identified for each 
group of learners. 

LEVELS OF ONLINE LEARNING 

Educational institutions, from elementary schools to 
universities, are using the WWW and the Internet in 
a variety of ways. For example, institutions may 
establish simple Web sites that provide potential 
students with information on staff roles and respon- 
sibilities; physical resources and layout of the insti- 
tution; past, present, and upcoming events; and a 
range of policy documents. Other institutions may 
use a range of Web-based applications such as e- 
mail, file storage, and exams to make available 
separate course units or entire programs to a global 
market (Bonk, 2001; Bonk et al., 1999). To classify
levels of Web integration that are educational in 
nature, we should look closely at the uses of the Web 
for learning. Online educationalists have identified a 
number of different forms of online instruction,
including sharing information on a Web site, commu- 
nicating one-to-one or one-to-many via e-mail, deliv- 
ering library resources via the Internet (e.g., elec- 
tronic databases), or submitting assignments elec- 
tronically (e.g., e-mail attachments, message board 
postings) (Dalziel, 2003; Ho & Tabata, 2001; Rata 
Skudder et al., 2003; Zhu & McKnight, 2001). 
However, the range of possibilities highlighted by 
these educationalists does not fully identify, explain, 
or describe the interactions, the teaching, or the 
learning that occurs within these environments. For 
best practice guidelines to be created for e-environ- 
ments, the common features and activities of the 
Internet or computer-connected courses affecting 
all students, regardless of Web tools used or how 
information is structured and stored, need to be 
identified and described. 



LEARNING ENVIRONMENTS 

In researching and evaluating the success or failure 
of time spent in educational settings, researchers 
could use a number of quantitative measures, such 
as grades allocated or total number of credits earned, 
participation rate in activities, graduation rate, stan- 
dardized test scores, proficiency in subjects, and 
other valued learning outcomes (Dean, 1998; Fraser 
& Fisher, 1994). However, these measures are 
somewhat limited and cannot provide a full picture of 
the education process (Fraser, 1998, 2001). There 
are other measures that can be used that are just as 
effective; for example, student and teacher impres- 
sions of the environment in which they operate are 
vital. The investigation in and of learning environments
has its roots in the Lewinian formula,
B = f(P, E). This formula states that behavior
(B) is a function (f) of the
person (P) and the environment (E). It recognizes
that both the environment and its interaction with 
personal characteristics of the individual are potent 
determinants of human behavior (Fraser, 1998). 

PERCEPTUAL MEASURES 

In the past, it has been common to use pencil and 
paper forms with the administrator supervising data
entry in learning environment research (Fisher &
Fraser, 1990; Fraser et al., 1992; Fraser & Walberg, 
1995). Instruments are carefully designed and ask 
students to select an appropriate response from a 
range of options. For example, the Science Labora- 
tory Environment Inventory (SLEI) begins by pro- 
viding students with directions on how to complete 
the questionnaire. They are informed that the form 
is designed to gauge opinion and that there are no right
or wrong answers. Students are asked to think about 
a statement and draw a circle around a numbered 
response. The range of responses is from 1 to 5, and 
the meaning of each response is explained carefully; 
for example, 1 is that the practice takes place almost 
never, while 5 indicates the practice occurs very 
often (Fraser & Fisher, 1994; Fraser & Tobin, 
1998). Data are analyzed by obtaining a total score 
for a specific scale. This scoring is often completed 
manually. Advancements in computer technologies 
have made it possible to explore the disposal of 
paper-and-pencil instruments and manual data en- 
try. Increasingly, traditional instruments are being 
replaced by electronic versions delivered through 
the Internet (Maor, 2000; Joiner et al., 2002; Walker,
2002). 
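
The manual scoring step described above can be sketched in a few lines of code. This is only an illustration, assuming a hypothetical instrument with two scales of three items each; the scale names and item numbers below are invented for the example and are not taken from the SLEI or any other published instrument.

```python
from typing import Dict, List

# Hypothetical mapping from scale name to the item numbers that belong to it.
# A real instrument defines its own scales and items.
SCALE_ITEMS: Dict[str, List[int]] = {
    "cohesiveness": [1, 2, 3],
    "open_endedness": [4, 5, 6],
}


def score_scales(responses: Dict[int, int],
                 scale_items: Dict[str, List[int]] = SCALE_ITEMS) -> Dict[str, int]:
    """Return the total score per scale for one student's 1-5 responses."""
    totals: Dict[str, int] = {}
    for scale, items in scale_items.items():
        total = 0
        for item in items:
            value = responses[item]
            # Responses outside the printed 1-5 range are rejected.
            if not 1 <= value <= 5:
                raise ValueError(f"item {item}: response {value} is outside 1-5")
            total += value
        totals[scale] = total
    return totals


# One student's circled responses, keyed by item number.
student = {1: 4, 2: 5, 3: 3, 4: 1, 5: 2, 6: 2}
print(score_scales(student))  # {'cohesiveness': 12, 'open_endedness': 5}
```

An electronic version of an instrument could run a check of this kind at submission time, which removes the separate manual data-entry step.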

FUTURE TRENDS

Setting the Scene

Three connected computer- or WWW-based edu- 
cational activities on the weather were described in 
section one. The first scenario illustrated how the 
information storage and retrieval functions of the 
WWW could be used to expand available student 
resources. In this scenario, students could be super- 
vised directly and assisted in their tasks by a teacher 
responsible for a dedicated computer suite estab- 
lished at the school. The second scenario demon- 
strated how the communication features of con- 
nected computers could be used to provide authentic 
examples to enrich student understanding. In this 
scenario, students could work independently of the 
teacher, who was present, however, to offer guid- 
ance and support. The third scenario described how 
Web-based educational management platforms could 
be used to provide educational opportunities for 
isolated pockets of students. In this scenario, students
are completely independent, and they rely on
the information and communication technologies pro- 
vided by their tutor for guidance and support. 

On the surface, it would appear that the online 
learning environments created in each of the three 
scenarios are distinct and that no common interac- 
tions or relationships can be identified for investiga- 
tion. For example, tutor-student interactions appear 
to be different. In the first scenario, students are 
guided by the continual physical presence of a tutor 
mentoring progress. In the second scenario, the tutor, 
on occasion, is physically present, offering guidance 
and support. In the third scenario, there is no physical 
relationship established, and the tutor’s interactions 
with students are virtual. Also, the physical environ- 
ment created in each scenario appears to be distinct. 
For example, in the first scenario, all students are 
located physically in a dedicated laboratory. In the 
second scenario, the computer is located in an exist- 
ing teaching space, possibly in a strategic place close 
to the teacher’s desk. The environment in the third 
scenario is dependent on the physical layout of the 
individual student’s home.

It could be argued, given these differences, that it 
would not be possible to investigate each environ- 
ment created using a single instrument. However, is 
this really the case? In each of the scenarios de- 
scribed, it is assumed that students have a functional 
knowledge of computer operations. For example, 
there is the assumption that students will be able to: 

• know if the computer is turned on or turned off 

• use a keyboard and computer mouse 

• view information presented on a visual display
unit; and

• select and/or use appropriate software applica- 
tions. 

A more complex example focuses on our under- 
standing of the process of learning. As mentioned in 
each of the examples, the students engage with the 
computer, and the tutor facilitates this engagement. 
It can be argued that there is in online environments 
a tutor-student relationship. We then can ask these 
questions: How do these relationships function? Are 
the students satisfied or frustrated by the relation- 
ships created? Does the tutor feel the relationships 
created are beneficial? 



These two examples — tutor-student and stu- 
dent-computer relationships — demonstrate how it 
may be possible to identify and describe common 
features of connected computer and online activi- 
ties. It then can be argued that if it is possible to 
identify and describe these relationships, it is also 
possible to investigate and explore them. It logically 
follows that if we can investigate and explore 
relationships, it is also possible to create best prac- 
tice guidelines for institutions and individuals to 
follow, thereby raising the standard of educational 
activities for all participants. 

Investigation of Relationships in 
Online Learning 

As noted, when reviewing educational activities in 
the online environment, we can immediately raise 
various questions about the nature of teacher- 
student and student-computer interactions. These 
two features have been expanded by Morihara 
(2001) to include student-student interaction, stu- 
dent-media interaction (an expansion to include 
other components rather than simply text) and the 
outcomes of the learning that take place in the 
environment created. Haynes (2002) has refined 
these relationships and identified four relationships 
within online environments that are outlined as 
follows: 

1. student-interface relationships

2. student-tutor relationships 

3. student-student relationships 

4. student-content relationships 

These four broad areas appear to identify the 
crucial relationships and interactions that occur 
within online environments. However, they do not 
help in clarifying how the student as an individual 
reacts to and reflects on his or her experiences in 
this environment. 

The importance of creating time for and encour- 
aging self-reflection of the learning process is well- 
documented by constructivists (Gilbert, 1993; 
Gunstone, 1994; Hewson, 1996; Posner et al., 1982),
and it would appear to be crucial to investigate if, 
when, and how this reflection occurs. Therefore, 
there appear to be five broad areas of online learn- 
ing interaction outlined as follows: 

1. Student-Media Interaction: How are students
engaged with digitally stored information,
and how do they relate to the information 
presented? 

2. Student-Student Relationships: How, why, 
and when do students communicate with each
other, and what is the nature of this communi- 
cation? 

3. Student-Tutor Relationships: How, why, 
and when do students communicate with their 
tutor, and what is the nature of this communica- 
tion? 

4. Student-Interface Interaction: What are the 
features of the interface created that enhance/ 
inhibit student learning and navigation? 

5. Student Reflection Activities: How are stu- 
dents encouraged to reflect on their learning, 
are they satisfied with the environment, and 
how do they relate to the environment created? 

These relationships and interactions should form the 
development framework for the identification of 
scales and items to construct an online learning 
survey. Data generated from this instrument should 
guide online learning activities and help to shape 
online interactions. The best-practice guidelines 
generated will serve to raise the standard of online 
educational activities for all participants. 
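
As a rough sketch of how data from such a survey might be summarized, the five areas can be treated as scales and the responses averaged per scale. Everything here is an assumption made for illustration: the 1-5 response range, the identifiers used for the scales, the sample data, and the idea of flagging the lowest-scoring scale as a candidate for attention.

```python
from typing import Dict, List

# The five areas named above, used here as survey scale identifiers.
SCALES = [
    "student_media_interaction",
    "student_student_relationships",
    "student_tutor_relationships",
    "student_interface_interaction",
    "student_reflection_activities",
]


def scale_means(respondents: List[Dict[str, List[int]]]) -> Dict[str, float]:
    """Average the item responses for each scale across all respondents."""
    result: Dict[str, float] = {}
    for scale in SCALES:
        values = [v for r in respondents for v in r.get(scale, [])]
        if values:  # scales with no responses are left out
            result[scale] = round(sum(values) / len(values), 2)
    return result


def lowest_scale(means: Dict[str, float]) -> str:
    """Return the scale with the lowest mean response."""
    return min(means, key=means.get)


# Two hypothetical respondents, two items per scale.
respondents = [
    {"student_media_interaction": [4, 5], "student_student_relationships": [2, 3],
     "student_tutor_relationships": [4, 4], "student_interface_interaction": [3, 3],
     "student_reflection_activities": [2, 2]},
    {"student_media_interaction": [5, 4], "student_student_relationships": [1, 2],
     "student_tutor_relationships": [5, 3], "student_interface_interaction": [4, 4],
     "student_reflection_activities": [3, 2]},
]
means = scale_means(respondents)
print(means["student_media_interaction"])  # 4.5
print(lowest_scale(means))                 # student_student_relationships
```

A summary of this kind only points to where further investigation may be worthwhile; it does not replace the validation work needed for a real instrument.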

CONCLUSION 

The growth of connected computing technologies, 
the creation of the Internet, and the introduction of 
the World Wide Web have led to a number of 
educationalists and educational institutions becom- 
ing involved in the development and delivery of 
courses using these technologies. While the range, 
depth, and breadth of potential uses of these tech- 
nologies is vast and forever growing, and while it 
may appear that this divergent use of technologies 
creates a range of different, describable online 
learning environments with little or no commonality, 
it can be argued that there are indeed common 
relationships and interactions. Five relationships and 
interactions have been identified and described in 
this article: Student-Media Interaction, Student-Student
Relationships, Student-Tutor Relationships,
Student-Interface Interaction, and Student Reflection
Activities. This article also argued that if relation- 
ships and interactions can be identified and de- 
scribed, it is logical to assume that they can be 
explored and investigated. These investigations ulti- 
mately should lead to the creation of best-practice 
guidelines for online learning. These guidelines then 
could be used by educational institutions and indi- 
viduals to raise the standard of online educational 
activities. 

KEY TERMS

Internet: An internet (note the small i) is any set 
of networks interconnected with routers forwarding 
data. The Internet (with a capital I) is the largest 
internet in the world. 

Intranet: A computer network that provides 
services within an organization. 

Learning Environment: A term used to describe
the interactions that occur among individuals
and groups and the setting within which they 
operate. 

Learning Management System: A broad term 
used to describe a wide range of systems that 
organize and provide access to e-learning environ- 
ments for students, tutors, and administrators. 



Online Learning: The use by students of con- 
nected (online) computers to participate in educa- 
tional activities (learning). 

Perceptual Measure: An instrument used to 
investigate identified relationships in learning envi- 
ronments. 

World Wide Web: A virtual space of electronic 
information storage. 

INTRODUCTION

Technology-based education is taken as an effective 
tool to support structured learning content dissemi- 
nation within pre-defined learning environments. 
However, effectiveness and efficacy of this para- 
digm relate to how well designers and developers 
address the specificities of users’ learning needs, 
preferences, goals, and priorities taking into account 
their immediate work, social, and personal context. 
This is required in order to focus development 
efforts on the design of e-learning experiences that 
would satisfy identified needs. Thus, studying and 
assessing the human computer interaction side of 
such projects is a critical factor to designing holistic 
and productive e-learning experiences. 

Literature does not show consistent and inte- 
grated findings to support the effectiveness of e- 
learning as a strategic tool to develop knowledge and 
skill acquisition (Rosenberg, 2001; Shih & Gamon, 
2001). The objective of this article is, on the
one hand, to develop the main issues of an integrated
evaluation framework, focusing on key variables
from the people and technology standpoints within the
context of use, and, on the other hand, to summarize the
relevant tasks involved in designing e-learning experiences.
The main issues of an integrated
evaluation framework include: (i) some relevant 
context-specific factors, and (ii) other issues that 
are identified when people interact with technology. 
Context-specific factors such as culture, organization
of work, management practices, technology,
and working processes may influence the quality of
interaction (Laudon & Laudon, 2002) and may also
help define the organizational readiness to sustain
the acceptance and evolution of e-learning within 
organizational dynamics. Thus we propose an e- 
learning evaluation framework to be used as a 
diagnostic and managerial tool that can be based on: 
(a) an observed individual variable, as a visible sign
of implicit intentions, to support development effort 
during instructional design and initial users’ engage- 
ment, and/or (b) usability and accessibility as key 
identified technology variables addressing accep- 
tance and usage. 

The Background section presents our proposed 
theoretical evaluation framework to guide our analysis
based upon the reviewed literature and the issues arising from the
proposed framework. Last, we elaborate on some
future work and a general conclusion.

BACKGROUND 

Natural, effective, and also affective interactions 
between humans and computers are still open re- 
search issues due to the complexity and interdepen- 
dency of the dynamic nature of people, technology, 
and their interactions over time (Baudisch, DeCarlo,
Duchowski, & Geisler, 2003; Cohen, Dalrymple,
Moran, Pereira, & Sullivan, 1989; Gentner & Nielsen,
1996; Horvitz & Apacible, 2003; Preece, Rogers, &
Sharp, 2002). Despite last-decade advancements in 
principles associated with usability design, there is still
an ever-present need to better understand the people-technology
relationship in its context of use in
order to design more natural, effective, satisfying,
and enjoyable user experiences. Multimodal interactions
and smart, ambient, and collaborative technologies
are some current issues that are driving new
interaction paradigms (Dix, Finlay, Abowd, & Beale, 
1998; Oviatt, 1999). New skills and methods to 
perform work-related tasks at operational and stra- 
tegic levels within organizational dynamics, plus 
societal attitudes, individual lifestyles, priorities, pref- 
erences, physical and cognitive capabilities and lo- 
cations require more innovative approaches to de- 
signing user experiences. In addition, technical and 
users’ feedback coming from different evaluation 
sources require workable methods and tools to cap- 
ture and analyse quantitative and qualitative data in 
a systematic, consistent, integrated, and useful way. 
This situation makes the e-learning evaluation process a
complex one (Garrett, 2004; Janvier & Ghaoui, 
2004; Preece et al., 2002; Rosson & Carroll, 2002). 
Moreover, interpretation of an evaluation outcome 
requires an additional set of skills. Figure 1 shows 
three main aspects to consider when evaluating e-learning
experiences: (1) people-related issues (learning
preferences), (2) instruction-related issues (instruction
design), and (3) system-related issues (usability
and accessibility).

Organizational context and individual learning 
preferences aim at improving people-task fit. This 
means that people’s skills and related learning objec- 
tives are defined by: (a) their preferred ways of 
learning, and (b) the tasks individuals have to per- 
form within the scope of their organizational roles 
and specific contexts. Principles and practices of
instructional design and feasible multimodal choices
are taken into account to structure, organize, and 
present learning content and related tasks (Clark & 
Mayer, 2003). This way, contextual and work- 
relatedness of content is ensured. 

Usability and accessibility, as quality attributes of 
system performance, address the acceptance and 
usage of a system by the intended users. Learning 
outcomes, namely performance and satisfaction, once
analyzed, would drive initiatives for improvement
or new developments at operational and
strategic levels. These issues are further described 
in the next sections. 

Evaluating People-Related Issues 

From a people standpoint, learning styles are identi- 
fied by researchers, among the multiple individual 
traits that influence learning process, as a key com- 
ponent to design and evaluate effective and satisfac- 
tory instructional methodologies and education-ori- 
ented technologies. Reviewed literature on learning 
styles and individual differences (Atkins, Moore, 
Sharpe, & Hobbs, 2000; Bajraktarevic, Hall, & 
Fullick, 2003; Bernardes & O’Donoghue, 2003; 
Leuthold, 1999; McLaughlin, 1999; Sadler-Smith & 
Riding, 2000; Storey, Phillips, Maczewski, & Wang, 
2002; Shih & Gamon, 2001) shows that most research
findings are not conclusive and often contradictory 
regarding the impact of learning styles on outcomes 
of e-learning (McLaughlin, 1999; Shih & Gamon, 2001).

Figure 1. Designing e-learning experiences: People and technology aspects [diagram label: Organizational Context]

Still, many researchers agree that learning
styles: (a) are a relevant factor to the learning 
experience, and (b) influence learning behaviors likely 
affecting the degree to which individuals will engage 
in particular types of learning (Sadler-Smith & Riding, 
2000). However, the measurement of learning styles 
is complex and time-consuming, because they are
assessed by using questionnaires or psychometric
tests. Consequently, their use raises individuals’ concerns
about data privacy and protection. To motivate
a workable approach, we focus our theoretical framework
on learning preferences, which are defined as an
observable individual trait that shows what tasks or 
objects people favor over others of the same kind 
(McLaughlin, 1999). Hence, learning preferences, in 
our approach, would support the designing of ad- 
equate learning content. 

Learning preferences are revealed through choices 
or actions, and can be validated by using ethno- 
graphic techniques, self-reporting, or log analysis. 
This kind of systematic observation also helps reveal
learning and cognitive strategies in context.
In this way, learning preferences can serve as input to a
gradually evolving intelligent e-learning system. This knowledge of users’
actions and patterns, based on observations and
inquiries of users, would help development teams
understand what users favor when completing
learning tasks and interacting with related
material across different modalities, media, types of
learning, contexts, and other actors involved, namely
peers, instructors, and support staff. 
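
The log analysis mentioned above can be illustrated with a minimal sketch that infers, for each user, the media type chosen most often. The log format (pairs of user identifier and media type), the user names, and the media labels are all hypothetical; a real e-learning platform would expose its own log structure.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Each log record is (user_id, media_type); both fields are hypothetical.
LogRecord = Tuple[str, str]


def preferred_media(log: Iterable[LogRecord]) -> Dict[str, str]:
    """Return, per user, the media type chosen most often in the log."""
    counts: Dict[str, Counter] = {}
    for user, media in log:
        counts.setdefault(user, Counter())[media] += 1
    # most_common(1) yields the single most frequent (media, count) pair.
    return {user: c.most_common(1)[0][0] for user, c in counts.items()}


log = [
    ("ana", "video"), ("ana", "text"), ("ana", "video"),
    ("ben", "audio"), ("ben", "text"), ("ben", "text"),
]
print(preferred_media(log))  # {'ana': 'video', 'ben': 'text'}
```

Counting choices in this way captures only what users did, not why; the ethnographic techniques and self-reporting named above would still be needed to validate the inferred preferences.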

From this perspective, the quality of the interactivity
within an e-learning environment is defined by: (a) 
the adequacy of learning content to individual learn- 
ing preferences and the quality of service of tech- 
nical conditions, and (b) the quality of relationships 
with humans involved in the learning experience. 
The former depends on the available choices for
specific user groups based upon their preferences
and technical conditions. The latter depends on 
three specific roles: Instructors, Social Science 
practitioners, and Administrative/Helpdesk staff. 
What should each of these roles involve to ensure
quality of interactivity? Table 1 summarizes our 
view on the main responsibilities of these three 
roles.

Table 1. Relevant roles for ensuring interactivity within e-learning environments

Role: Instructor
Main responsibilities:
(a) Defining and monitoring the degree of adequacy
between users’ learning preferences, method, modalities, and
media across space and time, taking into account local and
remote specificities of physical and technical contexts of
learning;
(b) Reinforcing social aspects within learning communities,
contributing to habit formation, expected performance, and
conformance to social norms and practices (Preece et al.,
2002); and
(c) Being aware of, and actively exercising, his or her prominent
role as a member of development teams supporting systems
by constantly and systematically matching them to user
groups’ involvement, participation, learning, and capabilities.

Role: Social Science practitioners
Main responsibilities:
(a) Understanding the dynamic socio-cultural nature of the
People-System interaction; and
(b) Defining actions to develop human potential within
socio-technological contexts across business and academic sectors.

Role: Administrative/Helpdesk staff
Main responsibilities:
Ensuring quality levels in operational technical support to
smooth the transition phase of a change management process.

Evaluating Instruction-Related Aspects

A second relevant aspect in this theoretical framework
is instruction and design (methodology, content,
and related tasks). This aspect raises key
issues related to the allocation of organizational resources
to learning content creation and updating in
terms of: (a) pedagogically matching learning content
and related tasks to suit diverse learning
needs, preferences, objectives, increasingly evolving
education-oriented technology, and business strategies;
and (b) generating, structuring, organizing, and
presenting subject matter or task knowledge within
the scope of task execution. Interactivity with learning
content depends on these two issues. Table 2 shows
some instruction-related items to evaluate.

Table 2. Instruction-related aspects

What to evaluate:

• To what extent do learning outcomes relate to business strategies?

• How well does content structure facilitate internal and external navigation?

• To what extent are content organization and graphic layout effective in
achieving learning objectives?

• How well do frequency patterns and learning outcomes justify investment?

• To what extent does this way of learning accommodate the identified needs
of a diverse and dispersed population?

• What are the most cost-effective media and modalities for distributing
specific content to users in their context of use?

• What are the most frequent and effective learning tasks across media and
modalities?

• How well do learning preferences and learning tasks correlate?

• How effective is it to use organizational experts as coaches?

Evaluating System-Related Aspects 

We assume that an e-learning system acts as an
intermediary agent between instructors and learners.
As an intermediary agent, an e-learning system should be
designed to be not only effective and efficient, but also
affective and social. Effectiveness is concerned with
learning objectives, methods, and usability goals.
Efficiency is concerned with measuring the usage of
resources to achieve defined objectives. Affectivity
measures users’ feelings and satisfaction during
e-learning experiences (Dix et al., 1998; Rosson &
Carroll, 2002). Sociality concerns the extent to which
the system is perceived as part of a working group
(Preece et al., 2002; Reeves & Nass, 1996). If any of
these attributes is missing, e-learners will not be able
to engage in the experience and profit from its outcomes.
In addition, the quality of the interaction is affected
by the quality of service supplied by the system and the
related technological and physical infrastructure.

Regarding quality of service, our e-learning evaluation
framework addresses the technological and physical
specificities of the experience, such as system
performance, downloading times, traffic flows, access
profiling, backups, and learning facilities and
equipment, among others. Table 3 shows some items to
evaluate.

To evaluate holistically, we evaluate not only usability
but also the social implications of interaction for the
organizational context and its level of accessibility.
Usability is defined as the extent to which a system can
be used to achieve defined goals by intended users in an
effective, efficient, and satisfactory manner in a
specified context of use (Dix et al., 1998). Usability
evaluation has been based mainly on prototyping, heuristic
evaluation, observing users, and user testing, using
different types of methods and tools with a strong
quantitative orientation. Web-based applications have
brought the need to evaluate usability cost-effectively
across distributed applications and among geographically
dispersed and diverse users. Thus, automated usability
evaluation tools promise a cost-effective way to achieve
usability goals (Ivory & Hearst, 2002).
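
As an illustration of how the three parts of this usability definition might be quantified from user-testing records, the following minimal sketch computes one possible indicator for each. The record layout (TaskAttempt) and the choice of indicators (completion rate for effectiveness, mean time on successful attempts for efficiency, mean self-reported rating for satisfaction) are assumptions made for this example; they are not measures prescribed by the framework.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    """One user's attempt at one task (hypothetical record layout)."""
    user: str
    completed: bool
    seconds: float
    satisfaction: int  # self-reported rating, 1 (low) to 5 (high)


def usability_summary(attempts):
    """Return one effectiveness, efficiency, and satisfaction indicator."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    completed = [a for a in attempts if a.completed]
    return {
        # Effectiveness: share of attempts that reached the goal.
        "effectiveness": len(completed) / len(attempts),
        # Efficiency: mean time on successful attempts (None if none succeeded).
        "mean_seconds_per_success": (
            mean(a.seconds for a in completed) if completed else None
        ),
        # Satisfaction: mean self-reported rating over all attempts.
        "mean_satisfaction": mean(a.satisfaction for a in attempts),
    }


# Illustrative data only.
attempts = [
    TaskAttempt("u1", True, 40.0, 4),
    TaskAttempt("u2", False, 90.0, 2),
    TaskAttempt("u3", True, 60.0, 5),
    TaskAttempt("u4", True, 50.0, 3),
]
summary = usability_summary(attempts)
print(summary)
```

In practice, such indicators would be computed per user group and per context of use, so that results can be compared across the populations the evaluation targets.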

Regarding the social aspect of People-System interaction,
Agerfalk and Cronholm (2001) stated that actability:
(a) is “...an information system’s ability to perform
actions, and to permit, promote and facilitate the
performance of actions by users, both through the system
and based on information from the system, in some
business context...”; (b) is a quality metric; (c) focuses
on the social context of interaction within a business
context; (d) is of a more qualitative nature; and (e) by
its definition, reinforces the role of information systems
as communication actors and the need for users’
pre-existing knowledge and skills in IT and business tasks.
The potential implication of this definition is that
actable information systems are usable systems that make
explicit the business actions that they can support within
a specific organizational context. Potential
complementarities between usability and actability are
still to be shown. This is still being researched
(Agerfalk & Cronholm, 2001), and to our knowledge, there
is no reliable and valid measuring
Table 3. System-related aspects: Quality of service



What to evaluate.... 

• How well does the face-to-face (e.g., live streaming video) or remote
component (e.g., pre-recorded lectures, courseware, etc.) run in each
learning session?

• Which learning materials are downloaded most frequently, from where,
and by whom?

• How does the system support interactivity between and within groups 
across time and place? 

• How does the system learn useful and relevant information to achieve 
learning objectives? 

• To what extent would the information system’s outputs change current
business practices?



instrument yet. Nevertheless, this is a promising 
area to increase the success rate of information 
system implementation (Xia & Lee, 2004). How- 
ever, its importance as a complement to the quanti- 
tative orientation of usability testing is clear if as- 
suming that interactions take place within specific 
social contexts (communities, groups of people, fami- 
lies, or organizations). Within any group’s context, 
conformance to social rules is required from each of 
the group’s members and strongly affects individual 
and surrounding social dynamics. 

Regarding accessibility, it means that any potential
user can access content regardless of his or her
cognitive and physical capabilities (Chilsolm,
Vanderheiden, & Jacobs, 2001). Setting feasible goals for
making accessibility a reality involves a trade-off among
flexibility in design, personalization to specific needs,
usage of assistive technology (Arion & Tutuianu, 2003;
Sloan, Gibson, Milne, & Gregor, 2003), and organizational
readiness to create and sustain accessibility as a
strategic issue.

Assuming that an extended definition of usability includes
actability, this integrated evaluation would cover
efficiency, efficacy, satisfaction, and accessibility, in
addition to conformance to social norms, legislation,
ethics, and current business practices. Integrated
feedback on these five key issues would make users and
development teams (working in a participatory-based
methodology) fully aware of, and responsible for, the
impact of pre-defined rules on organizational climate and
dynamics. In addition, this kind of feedback would indicate
areas for improving flexibility in design, but in a
controlled way, namely, by giving more options closely
focused on users’ capabilities and their task needs.
Table 4 shows some items to evaluate regarding usability
and accessibility.

MAIN FOCUS OF THE EVALUATION 
FRAMEWORK FOR E-LEARNING 

The main issues associated with these three aspects are:
(a) e-learning personalization, mainly in terms of
individual learning preferences, taking other relevant
background variables as control variables (e.g., goals,



Table 4. System-related aspects: Usability and accessibility 



What to evaluate.... 

• How easy is the system to learn and use across user groups in their
respective contexts?

• To what extent, do users perceive the system to contribute easily to their 
interaction with: (a) Content, (b) Peers, (c) Instructors, (d) Support staff? 

• To what extent, is organizational dynamics affected by the use of the 
system? In what user group is the influence most significant? 

• Is the system accessible to potential users regardless of their cognitive and
physical capabilities?

• How well do learning results justify investment in achieving usability and 
accessibility goals in terms of learning outcomes across users’ groups? 

• Can the identified communication patterns reinforce dynamically 
organizational values and expected performance level and behaviors? 
learning priorities, type of learning need, background,
previous IT experience, etc.); (b) coordinating,
monitoring, and controlling the learning process and
e-learning strategy; (c) the degree of participation of
users in designing the experience; and (d) integrating
quantitative and qualitative feedback on effectiveness,
efficiency, accessibility, satisfaction, and conformance
to the social context’s rules to improve human-computer
interaction and its outcomes. Table 5 shows these issues,
summarizing investment areas and some examples of
organizational programs to help implement and ensure
e-learning effectiveness. This considers two basic
scenarios created by the geographical locations of users
participating in an e-learning experience.

It is worth noting two points. First, IT experience or
basic skills are not necessarily acquired within
organizational settings. Exposure at this level should be
an orchestrated effort at the societal and political
levels to ensure competitiveness. When the business
context does not sustain that, specific organizational
interventions should be in place to ensure engagement and
habit formation, such as orientation and coaching
programs. Second, coordination and monitoring of efforts
are key to ensuring consistent methods across the
different participating geographical locations.

Given the increasing diversity of users’ knowledge, skill
levels, needs, and contexts, we believe that applying
automated or semi-automated evaluation tools and
ethnographic techniques within the development cycle could
be cost-effective in gradually improving the
“intelligence” of e-learning systems. Namely, this would
help e-learning systems to adapt to: (a) the dynamic
development of users’ competence and the formation of
habits, expectations, and involvement; and (b) observed
choices and actions. This could foster a gradual
alignment of learning outcomes and technology with
individual expectations and learning preferences, on the
one hand, while optimizing the organizational resources
allocated within e-learning environments, on the other.
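
One simple form that such automated evaluation could take is log analysis of observed actions. The sketch below counts downloads by resource, by site, and by user. The log format (comma-separated user, site, and resource) and all identifiers are hypothetical and chosen only for illustration; a real e-learning system would define its own log schema.

```python
from collections import Counter

# Hypothetical log format: "user_id,site,resource", one download per line.
LOG_LINES = [
    "u1,site_a,intro.pdf",
    "u2,site_b,intro.pdf",
    "u1,site_a,lecture1.mp4",
    "u3,site_a,intro.pdf",
    "u2,site_b,lecture1.mp4",
    "u3,site_a,quiz1.html",
]


def download_counts(lines):
    """Count downloads per resource, per site, and per user."""
    by_resource, by_site, by_user = Counter(), Counter(), Counter()
    for line in lines:
        user, site, resource = line.strip().split(",")
        by_resource[resource] += 1
        by_site[site] += 1
        by_user[user] += 1
    return by_resource, by_site, by_user


by_resource, by_site, by_user = download_counts(LOG_LINES)
print(by_resource.most_common(2))  # most frequently downloaded materials
print(by_site.most_common())       # where the downloads come from
print(by_user.most_common())       # who downloads
```

Counts of this kind only describe observed choices; relating them to learning preferences or outcomes would still require the qualitative techniques mentioned above.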

To do so, the development team should have an additional
set of skills and tasks. Based on the reviewed literature
(Agerfalk & Cronholm, 2001; Bernardes & O’Donoghue, 2003;
Clark & Mayer, 2003; Preece et al., 2002; Rosenberg, 2001;
Rosson & Carroll, 2002; Sloan et al., 2003) and insights
from multi-disciplinary practitioners, Table 6 summarizes
some main tasks and suggested techniques or tools required
of team members.

FUTURE TRENDS 

Flexible design (rooted in universal principles and
dedicated design) appears to be a continuing orientation
for interface design in the coming years. In this sense,
development teams should be prepared to approach
flexibility in design based upon tested,



Table 5. Relevant context-specific aspects required for designing an e-learning experience 



Context-specific aspects 


1. Organization 


Orientation programs 


Investment in Connectivity, Communication, Vocational Counseling 
and Content 


2. Management 
practices 


(a) Definition of level of investment on skill and system development 
articulated with business strategic objectives; 

(b) Identifying key competencies and strategic options to develop 
them, 

(c) Managing change and involved partners. 

(d) IT skill development 


3. Business 
processes 


Coordination, Controlling, and Monitoring quality of: 

(a) instructional design, 

(b) content production, 

(c) service and 

(d) learning outcomes 


4. Technology 


(a) Usability and Accessibility 

(b) Monitoring Connectivity, system performance and functionalities
Table 6. Development-team members’ main tasks and suggested evaluation techniques or tools



Development 
team main 
roles 


Main tasks 


Suggested technique(s) or 
tool(s) 


1. System 
Developer 


(a) Contextual task and user modelling

(b) Linking business words and actions
within the e-learning system

(c) Defining interaction metaphor and
scenarios for the different users’ groups
considering their geographical location
and time convenience

(d) Effective, satisfactory and enjoyable
user experience to speed up learning
curves

(e) Matching learning content with
people’s learning preferences, learning
tasks, modalities, media and assistive
technology, if needed

(f) Ensuring proper flow of information
among users’ groups

(g) Monitoring acceptance and usage
levels

(h) Administering user profiles


• Observing users during 
People-System 
interaction 

• Field studies 

• Focus groups 

• Surveys and 
questionnaires 

• Structured interviews 

• Prototyping 

• Heuristic evaluation 

• Usability testing 

• Pluralistic evaluation 

• Content analysis 

• Statistical analysis 

• Log analysis 


2. Information 
architect 


(a) Structuring work-related knowledge 
structures in terms of business language, 

(b) Matching people’s learning 
preferences and presentation of 
information 

(c) Matching modality to structured 
content 


• Technical reports 

• Statistical techniques 

• Content analysis 

• Log analysis 

• Prototyping 

• Heuristic evaluation

• Usability testing 


3. Content 
manager 


(a) Generating and distributing content
cost-effectively


• Statistical techniques 

• Log analysis 

• Social Network Analysis 

• Structured interviews 

• Brainstorming 


4. Training-process
manager


(a) Identification of key skill gaps and
needs

(b) Learning cost-effectiveness 

(c) Supporting expected business 
behaviours and performance levels 

(d) Efficacy of intervention programs 

(e) Improving procedural justice in 
distributing content to proper target 

(f) Monitoring learning efficacy and 
productivity levels, and development of 
required IT skills 


• Model-based evaluation 

• Descriptive statistics 

• Surveys 

• Focus groups 

• Structured interviews 

• Questionnaires 


5. Instructor 


(a) Definition of learning objectives 
regarding identified skill gaps 

(b) Matching instructional design and 
teaching methodology with defined 
learning objectives and users’ learning 
preferences, physical or cognitive 
capabilities, background and previous 
experience 

(c) Structuring, organizing and presenting
learning content regarding users’ needs
and learning preferences

(d) Matching communication patterns
to students’ needs


• Review-based evaluation 

• Descriptive statistics 

• Surveys 

• Focus groups 

• Structured interviews 

• Questionnaires 

• Log analysis 
Table 6. Development-team members’ main tasks and suggested evaluation techniques or tools (cont.)



6. Social- 
sciences staff 


(a) Assessment of ergonomic, social and 
cultural impact of technology usage in 
order to minimize health-related 
problems and costs 

(b) Assessing needs regarding cognitive 
and physical capabilities and learning 
preferences to efficiently, accessibly and 
flexibly design for users 

(c) Defining organizational 
interventions or programs such as 
Counselling and Coaching 

(d) Assuring users’ confidence during 
People-System interaction 


• Observing users in 
context 

• Descriptive statistics 

• Surveys 

• Focus groups 

• Storytelling 

• Structured interviews 

• Questionnaires 

• Social Network Analysis 

• Wizard of Oz 


7. Administrative
& Helpdesk staff


Administrative and technical diagnosis 
to provide high-quality assistance 


• Opinion polls 

• Questionnaires 


8. Representative
users


(a) Efficiency and effectiveness in 
performing work-related or learning 
tasks easily 

(b) Higher people-system fit 

(c) Less waste of personal and 
organizational resources 

(d) Eventually, less work-related
conflict and stress


• Cognitive walkthrough 

• Role Playing 

• Think aloud protocol 



simplified, and valid evaluation models. Further de- 
velopments may include the following: 

First, flexibility demands the definition of more
affectively and socially oriented heuristics, which would
require smart tools and techniques to improve the quality
of the interaction across learning preferences.

Second, flexibility may benefit from having user
representatives as part of a development team. Research
should identify under what conditions and at what stages
such involvement could be both more cost-effective and
more likely to ensure the success of e-learning. These
results could guide changes to human-computer interaction
curricula and perhaps disseminate practices among other
related professions.

Third, the increased use of portable equipment is
facilitating virtual classroom environments in which
people can participate from any place at any time.
Research should explore the effectiveness and convenience
of different modalities during the cycle and stages of the
learning process. For instance, it may be efficient to
consult information of interest when researching a new
topic by using a personal digital assistant (PDA) from
anywhere. However, developing that content requires
specific physical conditions that cannot exist anywhere at
any time. Thus, current and emergent habits across
generations of people should be explored in terms of
convenience and learning effectiveness.

CONCLUSION 

We discussed a holistic framework to evaluate e-learning
experiences, taking into account people, technology, and
instructional aspects. Learning preferences, usability,
including the social and affective aspects involved in
human-computer interaction (Bernardes & O’Donoghue, 2003;
Laudon & Laudon, 2002; McLaughlin, 1999; Picard, 1997;
Preece et al., 2002; Rentroia-Bonito & Jorge, 2003;
Rosson & Carroll, 2002), and accessibility (Chilsolm
et al., 2001) were identified as a set of key issues for
evaluating e-learning experiences. The expected results of
our integrated evaluation approach would allow
development-team members to achieve a better understanding
of how the identified issues affect learning outcomes and
so to identify future improvements and developments.
KEY TERMS

Context-Specific Aspects: Cover the most important
factors that shape, and become characteristic of,
organizational dynamics, such as culture, business
strategies, organization of work, management practices,
current technology, workforce competency level, and
working processes, among others.

E-Learning Development Team: The set of
multi-disciplinary professionals required to develop and
conduct an integrated e-learning evaluation. Each team
should include designers, developers, instructors, process
managers, social-science professionals (e.g., psychology,
sociology, and human resources practitioners, and
managers, among others), Helpdesk staff, and eventually
user representatives of the target population.

E-Learning Evaluation Framework: Comprises integrated
feedback based on people, system, and context-specific
aspects.

Individual Styles (Learning and Cognitive Styles): Relate
to an individual’s main implicit modes of acquiring,
organizing, and processing information in memory. They
are assessed by using questionnaires or psychometric
tests.

Learning Preferences: Individual favoring of 
one teaching method over another, which can be 
consistently observed through individual choices or 
actions. 

People Aspects: In this evaluation framework, cover
background, individual learning preferences, goals, and
priorities.

System Aspects: Cover the technological and 
physical specificities of the e-learning experience at 
server and data layers and the usability and acces- 
sibility issues of the presentation layer of the e- 
learning system. 
INTRODUCTION

Desktop multimedia (multimedia personal computers) dates
from the early 1970s. At that time, the enabling force
behind multimedia was the emergence of the new digital
technologies in the form of digital text, sound,
animation, photography, and, more recently, video.
Nowadays, multimedia systems are mostly concerned with the
compression and transmission of data over networks,
large-capacity and miniaturized storage devices, and
quality of service; however, what fundamentally
characterizes a multimedia application is that it does not
understand the data (sound, graphics, video, etc.) that it
manipulates. In contrast, intelligent multimedia systems,
at the intersection of the artificial intelligence and
multimedia disciplines, gradually have gained the ability
to understand, interpret, and generate data with respect
to content.

Multimodal interfaces are a class of intelligent 
multimedia systems that make use of multiple and 
natural means of communication (modalities), such 
as speech, handwriting, gestures, and gaze, to sup- 
port human-machine interaction. More specifically, 
the term modality describes human perception on 
one of the three following perception channels: 
visual, auditive, and tactile. Multimodality qualifies 
interactions that comprise more than one modality 
on either the input (from the human to the machine) 
or the output (from the machine to the human) and 
the use of more than one device on either side (e.g., 
microphone, camera, display, keyboard, mouse, pen, 
track ball, data glove). Some of the technologies 
used for implementing multimodal interaction come 
from speech processing and computer vision; for 
example, speech recognition, gaze tracking, recog- 
nition of facial expressions and gestures, perception 
of sounds for localization purposes, lip movement 
analysis (to improve speech recognition), and inte- 
gration of speech and gesture information. 



In 1980, the put-that-there system (Bolt, 1980) 
was developed at the Massachusetts Institute of 
Technology and was one of the first multimodal 
systems. In this system, users simultaneously could 
speak and point at a large-screen graphics display 
surface in order to manipulate simple shapes. In the 
1990s, multimodal interfaces started to depart from 
the rather simple speech-and-point paradigm to inte- 
grate more powerful modalities such as pen gestures 
and handwriting input (Vo, 1996) or haptic output. 
Currently, multimodal interfaces have started to 
understand 3D hand gestures, body postures, and 
facial expressions (Ko, 2003), thanks to recent 
progress in computer vision techniques. 

BACKGROUND 

In this section, we briefly review the different types 
of modality combinations, the user benefits brought 
by multimodality, and multimodal software architec- 
tures. 

Combinations of Modalities 

Multimodality does not consist in the mere juxtapo- 
sition of several modalities in the user interface; it 
enables the synergistic use of different combinations 
of modalities. Modality combinations can take sev- 
eral forms (e.g., redundancy and complementarity) 
and fulfill several roles (e.g., disambiguation, sup- 
port, and modulation). 

Two modalities are said to be redundant when 
they convey the same information. Redundancy is 
well illustrated by speech and lip movements. The 
redundancy of signals can be used to increase the 
accuracy of signal recognition and the overall ro- 
bustness of the interaction (Duchnowski, 1994). 

Two modalities are said to be complementary
when each of them conveys only part of a message
but their integration results in a complete message.
Complementarity allows for increased flexibility and 
efficiency, because a user can select the modality of 
communication that is the most appropriate for a 
given type of information. 

Mutual disambiguation occurs when the integra- 
tion of ambiguous messages results in the resolution 
of the ambiguity. Let us imagine a user pointing at 
two overlapped figures on a screen, a circle and a 
square, while saying “the square.” The gesture is 
ambiguous because of the overlap of the figures, and 
the speech also may be ambiguous if there is more 
than one square visible on the screen. However, the 
integration of these two signals yields a perfectly 
unambiguous message. 
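
The example above can be sketched in a few lines of code. In this minimal illustration, each modality produces a set of candidate referents, and integration keeps only the candidates consistent with both. The object identifiers are hypothetical, and real systems typically weigh ranked recognition hypotheses rather than intersecting plain sets; the set intersection is a simplifying assumption.

```python
def disambiguate(gesture_candidates, speech_candidates):
    """Return the objects consistent with both the gesture and the speech."""
    return set(gesture_candidates) & set(speech_candidates)


# The pointing gesture covers two overlapped figures (hypothetical ids).
gesture = {"circle_1", "square_1"}
# "The square" matches every square visible on the screen.
speech = {"square_1", "square_2"}

referent = disambiguate(gesture, speech)
print(referent)
```

Each input alone leaves two candidates; together they leave exactly one, which is the sense in which the disambiguation is mutual.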

Support describes the role taken by one modality 
to enhance another modality that is said to be 
dominant; for example, speech often is accompanied 
by hand gestures that simply support the speech 
production and help to smooth the communication 
process. 

Finally, modulation occurs when a message that 
is conveyed by one modality alters the content of a 
message conveyed by another modality. A person’s 
facial expression, for example, can greatly alter the 
meaning of the words he or she pronounces. 

User Benefits 

It is widely recognized that multimodal interfaces, 
when carefully designed and implemented, have the 
potential to greatly improve human-computer inter- 
action, because they can be more intuitive, natural, 
efficient, and robust. 

Flexibility is obtained when users can use the modality
of their choice, which presupposes that the different
modalities are equivalent (i.e., they can convey the same
information). Increased robustness can result from the
integration of redundant, complementary, or disambiguating
inputs. A good example is that of visual speech
recognition, where audio signals and visual signals are
combined to increase the accuracy of speech recognition.
Naturalness results from the fact that the types of
modalities implemented are close to the ones used in
human-human communication (i.e., speech, gestures, facial
expressions, etc.).

Figure 1. Multimodal software architectures

Software Architectures 

In order to enable modality combinations in the user 
interface, adapted software architectures are needed. 
There are two fundamental types of multimodal software
architectures, depending on the types of modalities. In
feature-level architectures, the integration of modalities
is performed during the recognition process, whereas in
semantic-level architectures, each modality is processed
or recognized independently of the others (Figure 1).

Feature-level architectures generally are consid- 
ered appropriate for tightly related and synchronized 
modalities, such as speech and lip movements 
(Duchnowski et al., 1994). In this type of architec- 
ture, connectionist models can be used for process- 
ing modalities because of their good performance as 
pattern classifiers and because they easily can inte- 
grate heterogeneous features. However, a truly 
multimodal connectionist approach is dependent on 
the availability of multimodal training data, and such 
data currently is not available. 
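
To make the idea of feature-level integration concrete, the following minimal sketch concatenates audio and visual feature vectors into a single vector before classification. A nearest-centroid classifier stands in for the connectionist models mentioned above purely to keep the example short; the feature values, the feature meanings, and the class labels are all invented for illustration.

```python
import math


def fuse(audio_features, visual_features):
    """Feature-level fusion: build one joint vector before recognition."""
    return list(audio_features) + list(visual_features)


def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(vector, centroids):
    """Return the label whose centroid is closest to the fused vector."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# Hypothetical training data: two audio features that barely separate the
# classes, plus one visual feature (degree of lip closure) that does.
training = {
    "ba": [fuse([0.52, 0.48], [0.9]), fuse([0.50, 0.50], [0.8])],
    "da": [fuse([0.48, 0.52], [0.2]), fuse([0.50, 0.50], [0.1])],
}
centroids = {label: centroid(vs) for label, vs in training.items()}

# The audio alone is ambiguous; the fused vector is not.
label = classify(fuse([0.50, 0.50], [0.85]), centroids)
print(label)
```

Because the classifier only ever sees the joint vector, the two streams must be synchronized and sampled together, which is why this style of architecture suits tightly coupled modalities and depends on multimodal training data.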

When the interdependency between modalities implies
complementarity or disambiguation (e.g., speech and
gesture inputs), information typically is integrated in
semantic-level architectures (Nigay et al., 1995). In this
type of architecture, the main approach for modality
integration is based on the use of data structures called
frames. Frames are used to represent meaning and knowledge
and to merge information that results from different
modality streams.
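
A minimal sketch of frame merging follows, assuming that a frame is a set of named slots with a timestamp and that two frames may be merged when they are close in time and do not conflict. The slot names, the 1.5-second window, and the conflict rule are assumptions made for this example rather than details of any cited system.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Partial interpretation produced by one modality (hypothetical layout)."""
    modality: str
    timestamp: float  # seconds
    slots: dict = field(default_factory=dict)


def merge_frames(a, b, max_gap=1.5):
    """Merge the slots of two frames, or return None if they cannot combine."""
    if abs(a.timestamp - b.timestamp) > max_gap:
        return None  # too far apart in time to belong to the same command
    merged = dict(a.slots)
    for name, value in b.slots.items():
        if value is None:
            continue  # b has nothing to contribute to this slot
        if merged.get(name) is not None and merged[name] != value:
            return None  # the two modalities contradict each other
        merged[name] = value
    return merged


# Speech supplies the action; the gesture supplies the object and target.
speech = Frame("speech", 10.2, {"action": "move", "object": None, "target": None})
gesture = Frame("gesture", 10.6, {"object": "square_1", "target": (120, 45)})

command = merge_frames(speech, gesture)
print(command)
```

Here each modality is recognized independently, and integration happens only on the resulting frames, which is what distinguishes this approach from feature-level integration.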

MAIN ISSUES IN MULTIMODAL 
INTERACTION 

Designing Multimodal Interaction 

Recent developments in recognition-based interaction
technologies (e.g., speech and gesture recognition) have
opened a myriad of new possibilities for the design and
implementation of multimodal interfaces. However,
designing systems that take advantage of these new
interaction techniques is difficult. Our lack of
understanding of how different modes of interaction can
best be combined in the user interface often leads to
interface designs with poor usability. Most studies of the
natural integration of communication modes are found in
the experimental psychology research literature, but they
tend to describe human-to-human communication modes
qualitatively. Very few attempts have been made so far to
describe multimodal human-computer interaction
qualitatively or quantitatively (Bourguet, 1998; Nigay,
1995; Oviatt, 1997). Much more work is still needed in
this area.

Implementing Multimodality 

Developers still face major technical challenges in
implementing multimodality; indeed, the multimodal
dimension of a user interface raises numerous challenges
that are not present in more traditional interfaces
(Bourguet, 2004). These challenges include the need to
process inputs from different and heterogeneous streams;
the coordination and integration of several communication
channels (input modalities) that operate in parallel
(modality fusion); the partition of information sets
across several output modalities for the generation of
efficient multimodal presentations (modality fission);
dealing with uncertainty and recognition errors; and
implementing distributed interfaces over networks (e.g.,
when speech and gesture recognition are performed on
different processors). There is a general lack of
appropriate tools to guide the design and implementation
of multimodal interfaces.

Bourguet (2003a, 2003b) has proposed a simple framework,
based on the finite state machine formalism, for
describing multimodal interaction designs and for
combining sets of user inputs of different modalities. The
proposed framework can help designers to reason about
synchronization-pattern problems and to test interaction
robustness.
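
The general idea of describing a multimodal command with a finite state machine can be sketched as follows. This is an illustrative toy, not a reproduction of the cited framework: the states, the event encoding as (modality, value) pairs, and the "move" command are all assumptions made for the example.

```python
# Transition table: (current state, (modality, value)) -> next state.
# The command modelled here is: say "move", point at an object, then
# point at a destination.
TRANSITIONS = {
    ("idle", ("speech", "move")): "need_object",
    ("need_object", ("gesture", "point")): "need_target",
    ("need_target", ("gesture", "point")): "done",
}


def run(events, start="idle", accepting=("done",)):
    """Feed events to the machine; return (final state, accepted?)."""
    state = start
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            return state, False  # unexpected input for this state
        state = TRANSITIONS[key]
    return state, state in accepting


complete = run([("speech", "move"), ("gesture", "point"), ("gesture", "point")])
out_of_order = run([("gesture", "point"), ("speech", "move")])
print(complete)
print(out_of_order)
```

Writing the design down as an explicit transition table makes it possible to enumerate input orderings and check which ones the design accepts, which is one way of probing its robustness to differently synchronized inputs.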

Uncertainty in Multimodal Interfaces 

Natural modalities of interaction, such as speech 
and gestures, typically rely on recognition-based 
technologies that are inherently error prone. Speech 
recognition systems, for example, are sensitive to 
vocabulary size, quality of audio signal, and variability
of voice parameters (Halverson, 1999). Signal
and noise separation also remains a major challenge 
in speech recognition technology, as current sys- 
tems are extremely sensitive to background noise 
and to the presence of more than one speaker. In 
addition, slight changes in voice quality (due, for 
example, to the speaker having a cold) can signifi- 
cantly affect the performance of a recognizer, even 
after the user has trained it. 

Several possible user strategies to prevent or correct
recognition errors have been uncovered. Oviatt (2000)
shows that in order to avoid recognition errors, users
tend to spontaneously select the input mode they recognize
as being the most robust for a certain type of content
(modality selection strategy). When recognition errors
occur, Suhm (2001) suggests that users are willing to
repeat their input at least once, after which they will
tend to switch to another modality (modality switching
strategy). Finally, Oviatt (2000) reports cases of
linguistic adaptation, where users choose to reformulate
their speech in the belief that it can influence error
resolution: a word may be substituted for another, or a
simpler syntactic structure may be chosen. Overall, much
more research is still needed to increase the robustness
of recognition-based modalities.
APPLICATIONS

Two applications of multimodal interaction are de- 
scribed. 

Augmented Reality 

Augmented reality is a new form of multimodal 
interface in which the user interacts with real-world 
objects and, at the same time, is given supplementary 
visual information about these objects (e.g., via a
head-mounted display). This supplementary information is
context-dependent (i.e., it is drawn from the real objects
and fitted to them). The virtual world
is intended to complement the real world on which it 
is overlaid. Augmented reality makes use of the 
latest computer vision techniques and sensor tech- 
nologies, cameras, and head-mounted displays. It 
has been demonstrated, for example, in a prototype 
to enhance medical surgery (Dubois, 1999). 

Tangible Interfaces 

People are good at sensing and manipulating physi- 
cal objects, but these skills seldom are used in 
human-computer interaction. Tangible interfaces 
are multimodal interfaces that exploit the tactile 
modalities by giving physical form to digital informa- 
tion (Ishii, 1997). They implement physical objects, 
surfaces, and textures as tangible embodiments of 
digital information. The tangible query interface, for 
example, proposes a new means for querying rela- 
tional databases through the manipulation of physi- 
cal tokens on a series of sliding racks. 

FUTURE TRENDS

Ubiquitous Computing

Ubiquitous computing describes a world from which 
the personal computer has disappeared and has been 
replaced by a multitude of wireless, small computing 
devices embodied in everyday objects (e.g., watches, 
clothes, or refrigerators). The emergence of these 
new devices has brought new challenges for
human-computer interaction. A fundamentally new class of
modalities has emerged, the so-called passive
modalities, which correspond to information that is
automatically captured by the multimodal interface 
without any voluntary action from the user. Passive 
modalities complement the active modalities such as 
voice command or pen gestures. 

Compared with desktop computers, the screens
of ubiquitous computing devices are small or
nonexistent, small keyboards and touch panels are hard
to use when on the move, and processing power is
limited. In response to this interaction challenge,
new modalities of interaction (e.g., non-speech
sounds; Brewster, 1998) have been proposed, and
the multimodal interaction research community has
started to adapt traditional multimodal interaction
techniques to the constraints of ubiquitous
computing devices (Branco, 2001; Schaefer, 2003;
Schneider, 2001).

CONCLUSION 

Multimodal interfaces are a class of intelligent mul- 
timedia systems that extends the sensory-motor 
capabilities of computer systems to better match the 
natural communication means of human beings. As 
recognition-based technologies such as speech rec- 
ognition and computer vision techniques continue to 
improve, multimodal interaction should become wide- 
spread and eventually may replace traditional styles 
of human-computer interaction (e.g., keyboard and
mouse). However, much research still is needed to
better understand users’ multimodal behaviors in
order to help designers and developers to build
natural and robust multimodal interfaces. In
particular, ubiquitous computing is a new important trend in
computing that will necessitate the design of innova- 
tive and robust multimodal interfaces that will allow 
users to interact naturally with a multitude of
embedded and invisible computing devices.

KEY TERMS

Active Modality: Modality voluntarily and con- 
sciously used by users to issue a command to the 
computer; for example, a voice command or a pen 
gesture. 

Feature-Level Architecture: In this type of 
architecture, modality fusion operates at a low level 
of modality processing. The recognition process in 
one modality can influence the recognition process 
in another modality. Feature-level architectures gen- 
erally are considered appropriate for tightly related 
and synchronized modalities, such as speech and lip 
movements. 

Haptic Output: Devices that produce a tactile 
or force output. Nearly all devices with tactile output 
have been developed for graphical or robotic appli- 
cations. 

Modality Fission: The partition of information 
sets across several modality outputs for the genera- 
tion of efficient multimodal presentations. 



Modality Fusion: Integration of several modal- 
ity inputs in the multimodal architecture to recon- 
struct a user’s command. 

Mutual Disambiguation: The phenomenon in 
which an input signal in one modality allows recov- 
ery from recognition error or ambiguity in a second 
signal in a different modality is called mutual disam- 
biguation of input modes. 

Passive Modality: Information that is captured 
automatically by the multimodal interface; for ex- 
ample, to track a user’s location via a microphone, a 
camera, or data sensors. 

Semantic-Level Architecture: In semantic-level
architectures, modalities are integrated at higher
levels of processing. Speech and gestures, for ex- 
ample, are recognized in parallel and independently. 
The results are stored in meaning representations 
that then are fused by the multimodal integration 
component. 
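
As a rough illustration of semantic-level fusion, the
following Python sketch fuses the n-best hypotheses of
two recognizers that have run independently. The
`Hypothesis` structure, the 1.5-second time window, and
the use of a product of confidence scores are all
assumptions made for this example; they are not taken
from any particular system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Hypothesis:
    """One recognizer hypothesis (hypothetical structure for this sketch)."""
    meaning: Dict[str, str]  # partial meaning representation (slot -> value)
    score: float             # recognizer confidence in [0, 1]
    time: float              # time of the input, in seconds


def unify(a: Dict[str, str], b: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Merge two partial meanings; return None if any shared slot conflicts."""
    merged = dict(a)
    for slot, value in b.items():
        if slot in merged and merged[slot] != value:
            return None
        merged[slot] = value
    return merged


def fuse(speech: List[Hypothesis], gesture: List[Hypothesis],
         window: float = 1.5) -> Optional[Tuple[float, Dict[str, str]]]:
    """Return the best-scoring compatible (score, meaning) pair, or None."""
    best = None
    for s in speech:
        for g in gesture:
            if abs(s.time - g.time) > window:
                continue  # inputs too far apart in time to belong together
            meaning = unify(s.meaning, g.meaning)
            if meaning is None:
                continue  # the two interpretations contradict each other
            joint = s.score * g.score
            if best is None or joint > best[0]:
                best = (joint, meaning)
    return best


speech = [
    Hypothesis({"action": "delete", "shape": "circle"}, 0.6, 10.0),
    Hypothesis({"action": "delete", "shape": "square"}, 0.4, 10.0),
]
gesture = [Hypothesis({"shape": "square", "object": "obj7"}, 0.9, 10.4)]
result = fuse(speech, gesture)
print(result)
```

In this example the top-ranked speech hypothesis
("circle") is discarded because it conflicts with the
gesture hypothesis, so the second-ranked one is chosen.
This is a small instance of mutual disambiguation.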

Visual Speech Recognition: Computer vision 
techniques are used to extract information about the 
lips’ shape. This information is compared with infor- 
mation extracted from the speech acoustic signal to 
determine the most probable speech recognition 
output.

INTRODUCTION

Telemedicine is broadly defined as the use of
information and communications technology (ICT) to
provide medical information and services (Perednia &
Allen, 1995). Telemedicine offers an unprecedented means
of bringing healthcare to anyone regardless of geo- 
graphic remoteness. It promotes the use of ICT for 
healthcare when physical distance separates the 
provider from the patient (Institute of Medicine,
1996). In addition, it provides for real-time feedback,
thus eliminating the waiting time associated with a 
traditional healthcare visit. 

Telemedicine has been pursued for over three 
decades as researchers, healthcare providers, and 
clinicians search for a way to reach patients living in 
remote and isolated areas (Norris, 2001). Early 
implementation of telemedicine made use of the 
telephone in order for healthcare providers and 
patients to interact. Over time, fax machines were 
introduced along with interactive multimedia, thus 
supporting teleconferencing among participants. 
Unfortunately, many of the early telemedicine 
projects did not survive because of high costs and 
insurmountable barriers associated with the use of 
technology. 

Telemedicine has been resurrected during the 
last decade as a means to help rural healthcare 
facilities. Advances in information and communica- 
tions technology have initiated partnerships between 
rural healthcare facilities and larger ones. The Internet 
in particular has changed the way in which medical 
consultations can be provided (Coiera, 1997). Per- 
sonal computers (PCs) and supporting peripherals, 
acting as clients, can be linked to medical databases 
residing virtually in any geographic space.
Multimedia data types (video, audio, text, imaging, and
graphics) promote the rapid diagnosis and treatment of
casualties and diseases.

Innovations in ICT offer unprecedented healthcare 
opportunities in remote regions throughout the world. 
Mobile devices using wireless connectivity are
growing in popularity as thin clients that can be linked to
centralized or distributed medical-data sources. These 
devices provide for local data storage of medical 
data, which can be retrieved and sent back to a 
centralized source when Internet access becomes 
available. Those working in nomadic environments 
are connected to data sources that in the past were 
inaccessible due to a lack of telephone and cable 
lines. For the military, paramedics, social workers, 
and other healthcare providers in the field, ICT 
advances have removed technology barriers that 
made mobility difficult if not impossible. 

Personal digital assistants (PDAs) 1 are mobile 
devices that continue to grow in popularity. PDAs 
are typically considered more usable for multimedia 
data than smaller wireless devices (e.g., cell phones) 
because of larger screens, fully functional key- 
boards, and operating systems that support many 
desktop features. Over the past several years, PDAs 
have become far less costly than personal-comput- 
ing technology. They are portable, lightweight, and 
mobile when compared to desktop computers. Yet, 
they offer similar functionality scaled back to ac- 
commodate the differences in user-interface de- 
signs, data transmission speed, memory, processing 
power, data storage capacity, and battery life. 

BACKGROUND 

Computing experts predicted that PDAs would sup- 
plant the personal computer as ubiquitous technol- 
ogy (Chen, 1999; Weiser as cited in Kim & Albers, 
2001). Though this has not yet happened, PDA 
usage continues to grow with advances in operating 
systems, database technology, and add-on features 
such as digital cameras. They are being used in 
sales, field engineering, education, healthcare, and 
other areas that require mobility. In the medical field, 
for example, they are being used to record and track 
patient data (Du Bois & McCright, 2000). This 
mobility is made possible by enterprise servers
pushing data onto these devices without user
intervention. Enterprise servers are also capable of
pulling data from a localized (PDA) database such that
centralized data sources are readily updated.

Table 1. User-interface design constraints for PDA devices (Paelke, Reimann, & Rosenbach, 2003)

Limited resolution: Typical resolution of a PDA is low
(240 × 320 pixels). This impacts the visibility of
content, objects, and images.

Small display size: The small screen size of a PDA
limits the number of objects and the amount of text on
a screen page. This limitation impacts design layout in
terms of font size, white space, links, text, images,
and graphics, among others.

Navigational structure: Navigation is impacted by the
increased number of screen pages required to
accommodate text and objects that on a desktop or
laptop would fit on one screen page. Design choices
include a long page with a flat navigation hierarchy
versus the design of multiple short pages with a deeper
navigational hierarchy.

Limited use of color: A PDA uses a gray scale or a
color palette limited to several thousand color choices
(compared to millions of color choices for desktop
applications). Readability and comprehension may be
impacted when color is used to relay information or
color combinations are insufficient in contrast.

Limited processing power: Limited processing power
impacts the quality of graphical displays and imaging.
It also restricts the use of interactive real-time
animation.

Mouse is replaced with stylus pen: A PDA does not use a
mouse, which has become a standard peripheral in a
desktop environment. As a result, there is a learning
curve associated with the use of a stylus pen, which
replaces mouse functionality.

Small keyboard size: The PDA keyboard size and layout
impact data entry. As a result, it is more difficult
for users to enter lengthy and complex medical data in
a real-time environment.

A PDA synchronizes with laptops and desktop 
computers, making data sharing transparent. This is 
made possible by a user interface and functionality 
that are compatible in terms of computing capabili- 
ties and input and output devices (Myers, 2001). 
Compatibility is a major issue in telemedicine given 
that medical and patient data gathered or stored on 
a PDA is typically sent to a centralized data source. 
Nomadic use of PDAs mandates this type of data 
integration whether it is real-time or batched data 
when wireless connectivity is temporarily inacces- 
sible (Huston & Huston, 2000). In addition, 
telemedicine data sharing is typically asymmetric in 
that the enterprise server transmits a larger volume 
of medical data to the PDA. In turn, the PDA 
transmits only a small volume of patient data to the 
server (Murthy & Krishnamurthy, 2004). 
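
The batched mode of data integration described above
can be pictured with a small store-and-forward sketch
in Python. The class name `LocalStore`, the record
fields, and the use of a plain list to stand in for the
centralized data source are hypothetical and serve only
to illustrate the idea; a real deployment would rely on
the synchronization facilities of its database product.

```python
from collections import deque
from typing import Deque, Dict, List


class LocalStore:
    """Hypothetical PDA-side store that batches records while offline."""

    def __init__(self) -> None:
        self.pending: Deque[Dict[str, str]] = deque()

    def record(self, entry: Dict[str, str]) -> None:
        """Save a record locally; nothing is transmitted yet."""
        self.pending.append(entry)

    def sync(self, server: List[Dict[str, str]], connected: bool) -> int:
        """Send all batched records when connected; return how many were sent."""
        if not connected:
            return 0
        sent = 0
        while self.pending:
            server.append(self.pending.popleft())
            sent += 1
        return sent


server_db: List[Dict[str, str]] = []  # stands in for the centralized source
pda = LocalStore()
pda.record({"patient": "P001", "note": "left heel sore"})
pda.record({"patient": "P002", "note": "no change"})
offline_sent = pda.sync(server_db, connected=False)  # nothing leaves the device
online_sent = pda.sync(server_db, connected=True)    # the batch is flushed
```

Records stay in the local queue until connectivity
returns, at which point they are transmitted in the
order in which they were gathered.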

Though PDAs hold great promise in promoting 
healthcare in remote regions, the usability of these 
devices continues to be an issue. There are physical 
constraints that typically do not apply to a laptop or 
desktop computer (Table 1 describes these
constraints). The user interface of a PDA is modeled
after a desktop environment with little consideration 
for physical and environmental differences (Sacher 
& Loudon, 2002). Yet, these differences are signifi- 
cant in terms of usability given the small screen and 
keyboard sizes and limited screen resources in terms 
of memory and power reduction (Brewster, 2002). 

There has been important research on PDA 
usability, primarily in the effective use of its limited 
screen area. Early research focused primarily on the 
display of contextual information in order to mini- 
mize waste of the screen space while maximizing 
content (Kamba, Elson, Harpold, Stamper, & 
Sukariya as cited in Buchanan, Farrant, Jones, 
Thimbleby, Marsden, & Pazzani, 2001). More re- 
cent efforts are taking into account not only screen 
size, but navigation, download time, scrolling, and 
input mechanisms (Kaikkonen & Roto, 2003). 

PDA USABILITY AND TELEMEDICINE 

An important finding of usability research associ- 
ated with mobile technology is the need for usability 
testing beyond a simulated environment. Waterson, 
Landay, and Matthews (2002), in their study of the
usability of a PDA, found that usability testing should
include both content and device design. Chittaro and 
Dal Cin (2002) studied the navigational structures of 
mobile user interfaces. Their research also identified 
the need for actual devices to be used in usability 
testing. Real-world constraints would take into
account screen-size and page-design issues, data entry
using a built-in keypad, wireless accessibility, data
transmission speeds, visual glare, background noise, 
and battery power, among others. 

Our initial findings also reflected the need for 
usability testing in the telemedical environment in 
which technology is used. We initiated research on 
the use of PDAs for monitoring diabetic patients 
living in remote regions of the United States (Becker, 
Sugumaran, & Pannu, 2004). Figure 1 illustrates one 
of our user-interface designs for the ViewSonic® 
PocketPC. This screen shows part of a foot form 
completed by a healthcare provider during a home 
visit. The data entered by the user is stored in a local 
database that can be transmitted wirelessly to an 
enterprise server. 

The PocketPC is used in this research because of 
its low cost and its support of relational database 
technology. It has a built-in digital camera, which is 
important because of the physical distance between 
a patient and a healthcare facility. Images of foot 
sores are taken during a home visit and stored in the 
local database residing on the PDA. These images 
become part of the patient’s history when transmit- 
ted to the enterprise server. Later, the images can be 
viewed by a clinician for the timely diagnosis and 
treatment of the sores. 

Our research has shown the technical feasibility 
of using PDA technology to gather data in the field 
during a home visit. However, more research is 
needed to address usability issues uncovered during 
the use of the PDA during simulated home visits. A 
significant finding in the use of PDA technology is 
that usability is tightly integrated with the technologi- 
cal challenges associated with it. One such challenge 
is the heavy reliance on battery power when PDAs 
are deployed in the field. When the battery no longer 
holds a charge, critically stored relational data may 
be irretrievable due to pull technology used to trans- 
mit data from a local source to a central one. 

Figure 1. PDA used to gather data about a
diabetic patient’s foot health

As part of this research, the usability of
multimedia data formats is being studied to
improve information access in a nomadic
environment. For rapid diagnosis and treatment of
casualties, multimedia data formats may prove criti- 
cal. In our work, images are being used to replace 
textual descriptions that would consume valuable 
screen space. Figure 1 illustrates this concept of 
using color codes to represent physical areas of the 
foot. As such, foot problems can be reported for 
each area by clicking on the list appearing on the 
right side of the screen. Audio capabilities are also 
being explored in providing helpful information that 
otherwise would be text based. Both of these mul- 
timedia capabilities are in the design phase and will 
be tested in future field studies. 



FUTURE TRENDS 

Table 2 identifies research opportunities associated 
with the use of PDAs in telemedicine. Much of 
what has been done in this area has focused on 
tracking patient histories. However, there are sig- 
nificant opportunities for real-time data retrieval 
and transmission using PDA technology. Clinicians 
could use a PDA, for example, to send prescriptions 
to pharmacies, receive lab reports, and review 
medical data for the diagnosis and treatment of
patients. These devices could also be used to
minimize human error associated with more traditional
mechanisms of recording patient data.

Table 2. Telemedicine research opportunities using PDA technology (Wachter, 2003)

Diagnosis and Treatment: Mobile decision support
software would allow for data entry of patient symptoms
with output providing a diagnosis and treatment plan.

Patient Tracking: Synchronizing a PDA with a hospital’s
centralized data source would allow vital signs and
other information to be gathered in real-time at the
point of care. A clinician would have the capability of
obtaining lab reports and test results once they have
been entered into the system.

Prescriptions: A PDA would be used by a clinician to
send a patient prescription to a pharmacy. This would
minimize human error associated with interpreting
handwritten prescriptions. It would also provide a
centralized tracking system in order to identify drug
interactions when multiple prescriptions for a patient
are filled.

Medical Information: Clinicians would have access to
information on medical research, drug treatments,
treatment protocols, and other supporting materials.
According to Wachter (2003), a leading clinical PDA
technology vendor has converted more than 260 medical
texts into PDA formats, thus supporting this effort.

Dictation: PDAs support multimedia data including
audio, images, and text. As such, clinicians would have
an opportunity to record multimedia patient data
directly linked to patient history data in a
centralized source.

Charge Capture: Data entry into a PDA that is
transmitted to a centralized source would provide the
means for efficient billing of medical charges to a
patient.

There are infrastructure challenges associated 
with the use of telemedicine in terms of technology 
acceptance and utilization. Chau and Hu (2004) 
point out that although telemedicine is experiencing 
rapid growth, there are organizational issues pertain- 
ing to technology and management. It is critical that 
organizational support is available throughout the 
implementation stages of telemedicine. Past experi- 
ence in the use of ICT with no infrastructural support 
resulted in failure. The effective management of 
telemedicine systems and supporting technologies is 
needed to address barriers to ICT acceptance by 
healthcare personnel and patients. As such, there 
are research opportunities in the organizational ac- 
ceptance and use of PDAs in a telemedical environ- 
ment. 

Security, safety, and social concerns have also 
been identified by Tarasewich (2003) as research 
challenges in the use of mobile technology. Though 
encryption and other security technologies can readily 
be used during the transmission of data, there re- 
mains the issue of security associated with lost or 
stolen PDAs. Given the memory, data storage, and 
other technological constraints of a PDA, research 
is needed on developing security mechanisms for
localized data. Research is also needed on ensuring
localized data remains private and is accessible only 
by authorized personnel. 

CONCLUSION 

The exponential growth of wireless and PDA tech- 
nologies has brought unprecedented opportunities in 
providing managed healthcare. For the military and 
others working in nomadic environments, PDA tech- 
nology offers the capability for rapid diagnosis and 
treatment of casualties. Regardless of location, 
healthcare personnel could be provided with real- 
time access to reference materials, patient lab re- 
ports, and patient history data. 

Though there is great promise in the use of PDAs
for providing telemedical services, research is
needed on the usability of these devices. Multimedia
data formats offer alternative interfaces for
accessing data, and research is needed to assess their
impact on ease of use and understandability. In
addition, technological constraints need to be studied
in terms of their impact on device usability. Memory,
data storage, transmission speeds, and battery life
need to be considered as part of usability testing to
assess the impact on rapid medical diagnosis and
treatment.

There is a major challenge in moving from
traditional medical services and resources to an
environment that promotes PDA technology and
telemedicine. The potential benefits are great in
terms of ubiquitous healthcare with no time or space
constraints. However, widespread acceptance of
PDA technology in a telemedical environment will
only become achievable through the design of usable
interfaces.

KEY TERMS

Compatibility: The ability to transmit data from 
one source to another without losses or modifications 
to the data or additional programming requirements. 

Interoperability: The ability of two or more 
systems or components to exchange information and 
to use the information that has been exchanged 
(Institute of Electrical and Electronics Engineers 
[IEEE], 1990). 

Peripheral Devices: Hardware devices, sepa- 
rate from the computer’s central processing unit 
(CPU), which add communication or other capabili- 
ties to the computer. 

Personal Digital Assistant (PDA): A personal 
digital assistant is a handheld device that integrates 
computing, telephone, Internet, and networking tech- 
nologies. 



Telecare: The use of information and communi- 
cations technology to provide medical services and 
resources directly to a patient in his or her home. 

Telehealth: The use of information and commu- 
nications technologies to provide a broader set of 
healthcare services including medical, clinical, ad- 
ministrative, and educational ones. 

Telemedicine: The use of information and com- 
munications technologies to provide medical ser- 
vices and resources. 

Wireless Application Protocol (WAP): The
wireless application protocol promotes the
interoperability of wireless networks, supporting
devices, and applications by using a common set of
applications and protocols
(http://www.wapforum.org).

ENDNOTES 

* This article is based on work supported by the 
National Science Foundation under Grant No. 
0443599. Any opinions, findings, and conclu- 
sions or recommendations expressed in this 
content are those of the author(s) and do not 
necessarily reflect the views of the National 
Science Foundation. 

1 PocketPCs, Palm Pilots, and other handheld 
devices are referred to as PDAs in this article.

INTRODUCTION

Through a transducer device and the movements
effected with a digital pen, we have a pen-based
interface that captures digital ink. This information
can be relayed to domain-specific application
software that interprets the pen input as appropriate
computer actions or archives it as ink documents,
notes, or messages for later retrieval and exchange
through telecommunications means.

Pen-based interfaces have rapidly advanced
since the commercial popularity of personal digital
assistants (PDAs), not only because they are
conveniently portable, but more so for their
easy-to-use freehand input modality, which appeals to a
wide range of users. Research efforts aimed at the latter
reason led to modern products such as personal
tablet PCs (personal computers; Microsoft
Corporation, 2003), corporate wall-sized interactive
boards (SMART Technologies, 2003), and
communal tabletop displays (Shen, Everitt, & Ryall,
2003).

Classical interaction methodologies adopted for 
the desktop, which essentially utilize the conven- 
tional pull-down menu systems by means of a 
keyboard and a mouse, may no longer seem appro- 
priate; screens are getting bigger, the interactivity 
dimension is increasing, and users tend to insist on 
a one-to-one relation with the hardware whenever 
the pen is used (Anderson, Anderson, Simon, 
Wolfman, VanDeGrift, & Yasuhara, 2004; Chong 
& Sakauchi, 2000). So, instead of combining the 
keyboard, mouse, and pen inputs to conform to the 
classical interaction methodologies for these mod- 
ern products, our ultimate goal is then to do away 
with the conventional GUIs (graphical user inter- 
faces) and concentrate on perceptual starting points 
in the design space for pen-based user interfaces 
(Turk & Robertson, 2000). 



BACKGROUND 

If we attempt to treat the digital pen as the
sole input modality for digital screens, for both the
interfacing and archival modes within the same
writing domain, we then require the conceptualization
of a true perceptual user interface (PUI) model.
Turk and Robertson (2000) discuss the main idea of
having an alternative (graphical user) interface
through the PUI paradigm as a hassle-free and
natural way of communicating with the background
operating system. It is subjective, and it concerns
finding out and (to a certain extent) anticipating what
users expect from their application environment.
There are several reasons to utilize the PUI as an
interactive model for the digital screen. Some of the
more prominent ones are the following:

• To reintroduce the natural concept of commu- 
nication between users and their devices 

• To present an intelligent interface that is able to 
react accordingly (as dictated by the objective 
of the application program) to any input ink 
strokes 

• To redesign the GUI exclusively for perceptual 
conceptions 

Modern, networked interactive digital screens
use the electronic pen’s digital ink as a convenient
way of interfacing with specially developed
application programs, and go on to offer the visual
communication of opinions among multiple users. This is a
result of taking advantage of the pen-based
environment. For example, we want to reproduce the simple,
customary blackboard and still be able to include all
other functionalities that an e-board can offer. But
by minimizing the number of static menus and
buttons (to accommodate new perceptual designs in
accordance with the PUI standards), the resultant
“clean slate” becomes the only perceptual input
available to users to relate to the background sys- 
tems. Here, we see two distinct domains merged into 
one: the domain to receive handwritings (or draw- 
ings) as the symbolic representation of information 
(termed technically as traces), and the domain to 
react to user commands issued through pull-down 
menus and command buttons. 

Based purely on the input ink traces, we must be
able to decipher users’ intentions in order to
correctly classify which of the two domains a trace is
likely to be in: either a primitive symbolic trace or
some sort of system command. Often, these two domains
overlap and pose the problem of ambiguity, a
gray area that cannot be simply classified by means
of straightforward algorithms. For instance, the back- 
ground system may interpret a circle drawn in a 
clockwise direction over some preexisting ink traces 
as a select command when in fact the user had 
simply intended to leave the circle as a primitive ink 
trace to emphasize the importance of his or her 
previously written points. Fortunately, this problem 
can be solved if the program can anticipate the 
intentions of its users (Wooldridge, 2002); however, 
this method necessitates the constant tracking of the 
perceptual environment and would require a more 
stringent and somewhat parallel structural construct 
in order to run efficiently (Mohamed, 2004b; 
Mohamed, Belenkaia, & Ottman, 2004). 
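
To make the ambiguous case concrete, the following
Python sketch flags a stroke as ambiguous when it forms
a nearly closed loop around existing ink. It is only an
illustration of why a purely geometric test cannot
settle the question; it is not the agent-based approach
cited above. The function names, the closure tolerance
of 0.2, and the use of trace centers are assumptions
made for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def is_closed_loop(stroke: List[Point], tolerance: float = 0.2) -> bool:
    """True if the stroke ends near its start, relative to its own size."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    size = max(max(xs) - min(xs), max(ys) - min(ys))
    if size == 0:
        return False  # a single dot is never a loop
    return math.dist(stroke[0], stroke[-1]) <= tolerance * size


def contains(polygon: List[Point], point: Point) -> bool:
    """Ray-casting point-in-polygon test (polygon is closed implicitly)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def interpret(stroke: List[Point], existing_centers: List[Point]) -> str:
    """Return 'ambiguous' for a loop drawn around existing ink, else 'ink'."""
    if is_closed_loop(stroke) and any(contains(stroke, c) for c in existing_centers):
        return "ambiguous"  # could be a select command or an emphasis circle
    return "ink"


loop = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 1)]
around_ink = interpret(loop, [(5, 5)])      # loop encloses earlier writing
away_from_ink = interpret(loop, [(20, 20)])  # nothing inside the loop
open_stroke = interpret([(0, 0), (10, 0)], [(5, 5)])
```

Geometry alone can only report that the stroke is
ambiguous; deciding between the two readings still
requires some model of what the user intends.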

Many published works describe the interpretation
of these traces, either exclusively in one domain or
in a combination of the two. In the trace-only domain,
the research of Aref, Barbara, and Lopresti (1996) and
of Lopresti, Tomkins, and Zhou (1996) concentrates on
deciphering digital ink as hand-drawn sketches and
handwriting, and then performing pictorial queries on
them; this work results from their effective
categorization of ink as a “first-class”
data type in multimedia databases. Others like
Bargeron and Moscovich (2003) and Gotze, 
Schlechtweg, and Strothotte (2002) analyze users’ 
rough annotations and open-ended ink markings on 
formal documents and then provide methods for 
resetting these traces in a more orderly,
cross-referenced manner. From the opposite perspective,
we see pilot works on pen gestures, which began
even before the introduction of styluses for digital
screens. They are based on the idea of generating
system commands from an input sequence of
predetermined mouse moves (Rubine, 1991). Moyle and
Cockburn (2003) built simple gestures for the con- 
ventional mouse to browse Web pages quickly, as 
users would with the digital pen. As gesturing with the 
pen gained increasing popularity over the years, Long, 
Landay, Rowe, and Michiels (2000) described an 
exhaustive computational model for predicting the 
similarity of perceived gestures in order to create better 
and more comfortable user-based gesture designs. 

For reasons of practicality and application suit- 
ability, but not necessarily for the simplicity of 
implementation, well-developed tool kits such as 
SATIN (Hong & Landay, 2000) and TEDDY 
(Igarashi, Matsuoka, & Tanaka, 1999) combine the 
pen input modality for two modes: sketching and 
gesturing. The automatic classification of ink inputs
directed at either mode does not usually include too
many gestures, and these tools normally place heavier
cognitive loads on the sketching mode. We agree
that incorporating a pen-based command gesture 
recognition engine, as a further evaluation of the 
input traces and as an alternative to issuing system 
commands for addressing this scenario, is indeed 
one of the most practical ways to solve the new 
paradigm problem. 

ISSUES ON REPRESENTING 
DIGITAL INK TRACES 

A trace refers to a trail of digital ink data made 
between a successive pair of pen-down and pen-up 
events representing a sequence of contiguous ink 
points: the x and y coordinates of the pen’s position.
Sometimes, we may find it advantageous to also 
include time stamps for each pair of the sampled 
coordinates if the sampling property of the trans- 
ducer device is not constant. A sequence of traces 
accumulates to meaningful graphics, forming what 
we (humans) perceive as characters, words, draw- 
ings, or commands. 

In its simplest form, we define a trace as a set of 
$(x_i, y_i, t_i)$ tuples, deducing them directly from each
complete pair of pen-down and pen-up events. Each 
trace must be considered unique and should be 
identifiable by its trace ID (identification). Figure 1 
depicts the object-oriented relations a trace has with 
its predecessors, which can fundamentally be
described as that of a shape interface (following the
Java OOP conventions).

Figure 1. Hierarchical object-oriented instances that define a trace

A trace also carries rendering information
such as pen color, brush style, the bounding box,
center of gravity, and so forth for visual
interfacing. These are represented inside the context
information of a trace. Traces with similar
context information can later be assembled (or
classified) together as trace groups. A normalized trace,
on the other hand, is a filtered trace with noise
removed and contents rationalized. It is used exclusively in
comparison techniques during the process of identifying
and classifying pen gestures. On the temporal
front, the timing associated with writing in freehand
can be categorized as follows:

• the duration of the trace, 

• its lead time, and 

• its lag time. 

Lead time refers to the time taken before an ink
trace is scribed, and lag time refers to the time taken
after an ink trace is scribed. This is illustrated in
Figure 2. For a set of contiguous ink components
$S = \{c_0, c_1, c_2, \ldots, c_n\}$ in a freehand sentence made up
of $n$ traces, we note that the lag time for the $i$th
component is exactly the same as the lead time of
the $(i+1)$th component; that is, $\mathrm{lag}(c_i) = \mathrm{lead}(c_{i+1})$.
Consequently, the timings that separate one set of
ink components apart from another are the first lead
time $\mathrm{lead}(c_0)$ and the last lag time $\mathrm{lag}(c_n)$ in $S$.
These times are significantly longer than their in-between
neighbors $c_1$ to $c_n$.
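
The relation between lag and lead times can be checked with a few lines of code. The sketch below assumes each ink component is reduced to a (pen-down time, pen-up time) pair; the function name `lead_lag_times` is hypothetical. The first lead time and the last lag time depend on ink outside the list, so they are left as `None`.

```python
from typing import List, Optional, Tuple

# (pen-down time, pen-up time) of one ink component.
Interval = Tuple[float, float]


def lead_lag_times(
    components: List[Interval],
) -> List[Tuple[Optional[float], Optional[float]]]:
    """Return (lead, lag) for each component, in writing order."""
    result = []
    for i, (down, up) in enumerate(components):
        # Lead: gap since the previous component's pen-up.
        lead = down - components[i - 1][1] if i > 0 else None
        # Lag: gap until the next component's pen-down.
        lag = components[i + 1][0] - up if i + 1 < len(components) else None
        result.append((lead, lag))
    return result
```

By construction, the lag of each component equals the lead of its successor.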

Most people write rather fast, such that the time 
intervals between intermediate ink components in 
one word are very short. If we observe a complete 
freehand sentence made up of a group of freehand 
words, we can categorize each ink component 
within those words into one of the following four 
groups. 

• Beginnings: Ink components found at the
start of a freehand word

• Endings: Ink components found at the end of
a freehand word

• In-Betweens: Ink components found in the
middle of a freehand word

• Stand-Alones: Disjointed ink components

Figure 2. Splitting up freehand writing into ink components on the time line

The groups differ in the demarcations of their 
lead and lag times, and as such, provide for a way in 
which a perceptual system can identify them. Other 
forms of freehand writings include mathematical 
equations, alphabets or characters of various lan- 
guages, and signatures. 
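
One way to read the four groups is as combinations of long and short lead and lag times. The sketch below is an assumption for illustration, not the authors' method: it uses a single pause threshold (`pause`, an arbitrary value here) to decide what counts as "long".

```python
def classify_component(lead: float, lag: float, pause: float = 500.0) -> str:
    """Assign an ink component to one of the four groups.

    `pause` is a hypothetical threshold separating within-word gaps
    from between-word pauses; its value is illustrative only.
    """
    long_lead = lead >= pause
    long_lag = lag >= pause
    if long_lead and long_lag:
        return "stand-alone"   # isolated on both sides
    if long_lead:
        return "beginning"     # pause before, quick continuation after
    if long_lag:
        return "ending"        # quick continuation before, pause after
    return "in-between"        # short gaps on both sides
```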

W3C’s (World Wide Web Consortium’s) current
InkML specification defines a set of primitive elements
sufficient for all basic ink applications (Russell
et al., 2004). Few semantics are attached to these
elements. All content of an InkML document is 
contained within a single <ink> element, and the 
fundamental data element in an InkML file is the 
<trace> element. 
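
As a rough illustration, the following Python sketch reads the `<trace>` elements of a minimal InkML document with the standard library. It assumes the namespace URI used by the W3C InkML specification and handles only plain, explicitly listed coordinate values; the sample data and the function name `read_traces` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the W3C InkML specification.
INKML_NS = "http://www.w3.org/2003/InkML"

# Illustrative document: points are separated by commas,
# coordinate values within a point by whitespace.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <trace>10 0, 9 14, 8 28, 7 42</trace>
  <trace>130 155, 144 159, 158 160</trace>
</ink>"""


def read_traces(inkml_text):
    """Return one list of coordinate tuples per <trace> element."""
    root = ET.fromstring(inkml_text)
    traces = []
    for element in root.iter(f"{{{INKML_NS}}}trace"):
        points = []
        for chunk in (element.text or "").split(","):
            values = chunk.split()
            if values:
                points.append(tuple(float(v) for v in values))
        traces.append(points)
    return traces
```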



ISSUES ON REPRESENTING 
PEN GESTURES 

Pen gestures are the direct consequence of inter- 
preting primitive ink traces as system commands or 
as appropriate computer actions. A pen gesture is 
not, however, an instance of a trace. 

While it is entirely up to the interpreter program 
to extract the meaning from the inputs and applica- 
tion contexts it received, our guidelines to the above 
claim are based on the fundamentals of the recognition
algorithm for classifying gestures. Rubine’s (1991)
linear classifier algorithm is a straightforward dot 
product of the coefficient weights of a set of trained 
feature values with the same set of extracted fea- 
tures from a raw input trace (see Figure 3). 
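
A linear classifier of this kind can be sketched in a few lines. The version below assumes each gesture class stores a bias weight followed by one weight per feature, and that the class with the highest score wins; the class names and weight values in the usage note are invented for illustration and are not trained values.

```python
from typing import Dict, Sequence


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Bias plus the dot product of per-feature weights and feature values."""
    bias, *coefficients = weights
    if len(coefficients) != len(features):
        raise ValueError("one weight per feature is required")
    return bias + sum(w * f for w, f in zip(coefficients, features))


def classify(
    gesture_weights: Dict[str, Sequence[float]],
    features: Sequence[float],
) -> str:
    """Return the gesture class whose linear score is highest."""
    return max(
        gesture_weights,
        key=lambda name: score(gesture_weights[name], features),
    )
```

For example, with the made-up weights `{"delete": [0.0, 1.0, -1.0], "copy": [0.5, -1.0, 1.0]}` and features `[2.0, 0.5]`, `classify` selects `"delete"`.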

Essentially, this means that gestures do not require
the storing of $(x_i, y_i, t_i)$ tuples, but rather they
should store the trained coefficient weights
$\{c_0, c_1, \ldots, c_n\}$, which were negotiated and agreed upon by
all parties attempting to synchronize the generality 
of the interpretation mechanism. That is, we need to 
ensure that the numbers, types, and techniques of 
features agreed upon for extraction are standard- 
ized across the board before we can be sure of 
issues of portability between applications. 



Figure 3. Relationship between a gesture and a 
trace with n features 




The temporal relation that singles out stand-alone 
components and freehand gestures from their con- 
tinuous writing and drawing counterparts (traces) is 
the longer-than-average lead and lag times of a 
single-stroke ink trace, as shown in Figure 4. In this 
case, there is a pause period between samplings of 
ink components that results in significantly longer 
lead and lag times. 

Based on the facts so far, we can now realize that 
it is possible to tackle the recognition process of the 
overlapping domains by focusing on the temporal 
sphere of influence. It dictates the very beings of the 
digital inks as either traces or gestures without the 
need to segregate the common input writing canvas. 

Figure 4. Stand-alone ink components that can be interpreted as possible gestures

ISSUES OF MOUNTING A PUI FOR
THE INTERACTIVE DIGITAL-PEN
ENVIRONMENT

We require a robust architecture that can provide
the necessary natural feedback loop between the
interfacing and interpreting mechanisms, the users,
and the foreground application in order to affect the
process of anticipation in a PUI environment. The
problem of ambiguity in the overlap between
two distinct ink domains (described previously)
still stands and needs to be solved. We point
out here again that if a program is able to intelligently
anticipate the intention of its users through the
constant tracking of the perceptual input environment,
then that problem can be overcome.

Figure 5. PUI model serving a sketching environment made up of a transducer and the digital pen

This brings about a typical agent-oriented ap- 
proach similar to that of a system utilizing interface 
agents. A notion that emphasizes adaptability, coop- 
eration, proactiveness, and autonomy in both design 
and run times engages agents for abstruse software 
development (Mohamed, 2003; Mohamed & 
Ottmann, 2003; Wooldridge, 2002). In our case, we 
tasked two semiautonomous agents to process input 
digital inks in parallel, with one serving in the trace- 
based domain (Ink agent) and the other in the 
gesture-based domain (Gesture agent). Both are 
expected to simultaneously track the input digital ink 
in the temporal sphere of influence. 

Figure 5 demonstrates the PUI model that we 
mounted to successfully work for the interactive 
digital-pen environment for the digital screen. It 
incorporates all of our previous discussions to ensure 
that the continuous tracking of all input digital inks is 



efficiently executed. A Decision agent is added 
between the two domain-specific agents and the 
foreground application for an added strength of 
decision making when drawing up percepts from the 
frontline agents. 

It is not very often that we see people gesturing 
to a control system in the middle of writing a 
sentence or drawing a diagram. So we can antici- 
pate, rather convincingly based on the lead and lag 
times obtained, that the latest ink component might 
be an instance of a trace rather than a gesture. 

Our analyses (Mohamed, 2004a) on handling the 
lead and lag times of the beginnings, endings, in- 
betweens, and stand-alones of the ink components 
led to the following findings. Based on key statistical 
concepts in the context of determining the best 
solution for our problem definition, we establish the 
alternative hypothesis $H_1$ and its nullifying opposite
$H_0$ as stated below.

$H_0$: An ink component is not a symbolic trace.

$H_1$: An ink component is a symbolic trace.

The darkened regions in Figure 6, depicting the
lead and lag time relationship and the probability of
an ink component being a trace, are the areas for the
rejection of the alternative hypothesis $H_1$, while the
lighter region is the area for acceptance of $H_1$.
Figure 6 is made to correspond directly as a lookup
table; one can use the input parameter $c_0(t_{lead}, t_{lag})$
and retrieve from the table an output probability of
whether an ink component should be considered
a symbolic trace given its lead and lag times (i.e.,
$P(\mathrm{Trace} \mid c_0(t_{lead}, t_{lag}))$) with an attached $H_1$ (strong
acceptance) or $H_0$ (strong rejection).

Figure 6. Lead and lag time relationships between traces and gestures

Two examples are given in Figure 6. The first,
with an input parameter of $c_1(1377, 1281)$, gets a
probability value of $P = 7.38 \times 10^{-8}$ and is recommended
for rejection. This means that the likelihood
of the ink component $c_1$ being a symbolic trace is
very slim, and further tests should be made to check
if we could indeed upgrade it to a command gesture.
The second, with an input parameter of $c_2(309, 1011)$,
receives $P = 0.1164$ and is recommended for
acceptance. This is a clear-cut case that the ink
component $c_2$ should definitely be considered as a
symbolic trace.
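
The lookup-and-decide step can be sketched as follows. The table holds only the two worked examples from the text; the cut-off `alpha = 0.05` is an assumption made for this illustration, since the text does not state the threshold behind the acceptance and rejection regions, and the name `recommend` is hypothetical.

```python
from typing import Dict, Tuple

# Only the two worked examples from the text; a full table would
# cover the whole plane of (lead, lag) values.
LOOKUP: Dict[Tuple[int, int], float] = {
    (1377, 1281): 7.38e-8,
    (309, 1011): 0.1164,
}


def recommend(lead: int, lag: int, alpha: float = 0.05) -> str:
    """Look up P(Trace | lead, lag) and attach a recommendation.

    `alpha` is an assumed cut-off, not a value taken from the source.
    """
    p_trace = LOOKUP[(lead, lag)]
    if p_trace >= alpha:
        return "accept H1: symbolic trace"
    return "reject H1: test for a command gesture"
```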

FUTURE TRENDS 

The lead and lag times resulting from freehand 
writings on digital boards are part of the ongoing 
process of managing, analysing, and reacting to all 
primitive ink data perceived from a writing environ- 
ment. We currently have in place a background 
process model (Figure 5) designed to actively assist 
running foreground applications tailored for the PUI 
paradigm. We believe that the temporal methods 
highlighted are statistically strong for influencing 
future decisions down the communication chains 
within the PUI model. As we expect to branch out 
from currently working with single-stroke gestures 
to incorporating multistroke gestures in our (interac- 
tive) PUI model, it is essential that we observe the 
constituents of any affected ink components through 
their geometric properties. Most of the literature 
reviewed so far points toward the further segmentation
of ink components into their primitive forms of
lines and arcs for symbol recognition on a 2-D (two- 
dimensional) surface. 

By being able to understand what is represented 
on-screen from the sketches made out by the primi- 
tive ink traces, and being able to issue multistroke 



gesture commands, we can expect a better percep- 
tual environment for more interactive pen-based 
digital-screen interfaces. 

CONCLUSION 

The advent of pen-based input devices has clearly 
revealed a need for new interaction models that are 
different from the classical desktop paradigm. In- 
stead of the keyboard and mouse, the electronic 
pen’s digital ink is utilized for a commodious way to
visually communicate ideas within the vicinity of the 
digital screen. By developing a true perceptual user 
interface for interaction with these screens, by 
means of having the digital pen as the sole input
modality, we achieve a robust architecture made up of
semiautonomous agents that are able to correctly 
anticipate users’ intentions on an invisible graphical 
user interface, treating the inputs convincingly as 
either written traces or gesture commands. 

KEY TERMS

Features: Information that can be gathered to 
describe a raw trace such as angles between sampled 
points, lengths, and the speed of the sketched trace. 

Gestures: Refers in our case to digital-pen 
gestures, which are movements of the hands while 
writing onto digital screens that are interpreted as 
system commands. 

GUI (Graphical User Interface): Specifically
involves pull-down menus for keyboard and mouse 
inputs. 

InkML: An XML (extensible markup language) 
data format for representing digital-ink data. 

Interactive Session: A period of communica- 
tion for the exchange of ideas by an assembly of 
people for a common purpose. 

Interactivity Dimension: The number of users 
that a single transducer, display system, or applica- 
tion software can support (by means of complete 
hardware or software simulations) during one par- 
ticular interactive session. 



Interface Agents: Semiautonomous agents that 
assist users with, or partially automate, their tasks. 

PUI (Perceptual User Interface): An invisible 
graphical user interface that engages perceptual 
starting points. 

Temporal Sphere of Influence: Tracking ac- 
tions while within a specified time domain. 

Trace: The resultant digital-ink representation 
made by movements of the hand using a digital pen 
on a digital screen. 

W3C (World Wide Web Consortium): An
international consortium of companies involved with
the Internet and the Web.

INTRODUCTION

Before you read on make sure you have a photo... I
will not answer to anyone I cannot imagine 
physically. Thanks. (Message posted to an online 
discussion forum) 

Individuals are increasingly employing Internet 
and communication technologies (ICTs) to mediate 
their communications with individuals and groups, 
both locally and internationally. Elsewhere, I have 
discussed current perspectives on the origins and 
impact of cyberculture(s) (Macfadyen, 2006a), theo- 
retical arguments regarding the challenges of inter- 
cultural communication in online environments 
(Macfadyen, 2006b), and recent approaches to studying
the language of cyberspace (Macfadyen,
2006c) — the very medium of interpersonal and in- 
tragroup communication in what is, as yet, the 
largely text-based environment of cyberspace. Vir- 
tual environments might in some sense be viewed as 
a communicative “bottleneck” — a milieu in which 
visual and oral cues or well-developed relationships 
may be lacking, and in which culturally diverse 
individuals may hold widely different expectations of 
how to establish credibility, exchange information, 
motivate others, give and receive feedback, or cri- 
tique or evaluate information (Reeder, Macfadyen, 
Roche, & Chase, 2004). 

Anecdotal evidence, and a growing body of re- 
search data, indicate that the greatest challenge that 
online communicators (and especially novice online 
communicators) experience is that of constructing 
what they consider to be a satisfactory or “authentic” 
identity in cyberspace, and in interpreting those online 
identities created by others. Rutter and Smith (1998) 
note, for example, that in their study of a regionally- 
based social newsgroup in the UK, communicators 
showed a real desire to paint “physical pictures” of 
themselves in the process of identity construction, and 
frequently included details of physical attributes, age,
and marital status. Moreover, authentic identity construction
and presentation also appear to contribute
to communicators’ perceptions of the possibility for
construction of authentic “community” online.

BACKGROUND 

As with the literature on many other aspects of ICTs 
(Macfadyen, Roche, & Doff, 2004), current litera- 
ture on the possibilities for “authentic” identity and 
community in cyberspace tends to offer either simple 
pessimistic condemnation (e.g., Blanco, 1999; Miah, 
2000) or optimistic enthusiasm (e.g., Levy, 2001, 
2001a; Michaelson, 1996; Rheingold, 2000; Sy, 2000). 
Perhaps it is not surprising that feelings run so high, 
however, when we consider that human questions of 
identity are central to the phenomenon of cyberspace. 
Levy reminds us that cyberspace is not merely the 
“material infrastructure of digital communications,” 
but also encompasses “the human beings who navi- 
gate and nourish that infrastructure” (2001, p. XVI). 
Who are these humans? How can we be sure? And 
how is the capacity for global communications im- 
pacting interpersonal and group culture, communi- 
cations and relationships? This article surveys re- 
cent theoretical and empirical approaches to think- 
ing about identity and community in cyberspace, and 
the implications for future work in the field of 
human-computer interaction. 

VIRTUAL IDENTITY AND 
COMMUNITY: CRITICAL THEMES 

Virtual Identity, Virtual Ethnicity, and 
Disembodiment 

Does the reality of “disembodied being” in cyberspace 
present a challenge to construction of identity? Key 
theoretical arguments regarding identity in cyberspace
revolve around questions of human agency, the 
degree to which individuals shape, or are shaped by 
the structures and constraints of the virtual world. 
Holmes (1998) argues, “human agency has radically 
changed in spatial, temporal and technological exist- 
ence” (p. 7); the emergence of cybercultures and 
virtual environments means, he suggests, that previ- 
ous perspectives on individuality as constituted by 
cognitive and social psychology may be less mean- 
ingful, especially as they do not consider aspects of 
space and time in the consideration of community 
and behaviour. Building on Holmes’s rethinking of
social relations, other contributors to his edited col- 
lection Virtual Politics: Identity and Community 
in Cyberspace suggest that alterations in the nature 
of identity and agency, the relation of self to other, 
and the structure of community and political repre- 
sentation by new technologies have resulted in a loss 
of political identity and agency for the individual. 
Jones (1997) similarly questions whether public unity 
and rational discourse can occur in a space 
(cyberspace) that is populated by multiple identities 
and random juxtapositions of distant communica- 
tors. Fernanda Zambrano (1998) characterizes indi- 
viduals in virtual society as “technological termi- 
nals” for whom state and nation are irrelevant but 
actually sees disembodiment and deterritorialization 
of the individual as a strength, offering the possibility 
for “productive insertion in the world” beyond tradi- 
tional geographically-bound notions of citizenship. 
Offering decidedly more postmodern perspectives, 
Turkle (1995) suggests that a model of fragmented 
(decentred) selves may be more useful for under- 
standing virtual identity, using theoretical perspec- 
tives on identity from psychology, sociology, psycho- 
analysis, philosophy, aesthetics, artificial intelligence, 
and virtuality, and Poster (2001) proposed a new 
vision of fluid online identity that functions simply as 
a temporary and ever-changing link to the evolving 
cultures and communities of cyberspace. Others 
(see, for example, Miah, 2000; Orvell, 1998) are, 
however, less willing to accept virtual identity as a 
postmodern break with traditional notions of identity, 
and instead argue that virtual reality is simply a 
further “sophistication of virtualness that has always 
reflected the human, embodied experience” (Miah,
2000, p. 211).



This latter author, and others, point out that 
regardless of theoretical standpoint, virtuality poses 
a real and practical challenge to identity construc- 
tion, and a number of recent studies have attempted 
to examine tools and strategies that individuals em- 
ploy as they select or construct identity or personae 
online (Burbules, 2000; Jones, 1997; Smith & Kollock,
1998). Rutter and Smith (1998) offer a case study of 
identity creation in an online setting, examining 
elements such as addressivity (who talks to whom) 
and self-disclosure, and how these elements contrib- 
ute to sociability and community. Jordan (1999) 
examines elements of progressive identity construc- 
tion: online names, online bios and self-descriptions. 

Interestingly, a number of authors focus explicitly
on the notion of “virtual ethnicity”: how individuals
represent cultural identity or membership in
cyberspace. Foremost among these is Poster (1998,
2001), who theorizes about “the fate of ethnicity in an
age of virtual presence” (p. 151). He asks whether 
ethnicity requires bodies — inscribed as they are with 
rituals, customs, traditions, and hierarchies — for true 
representation. Wong (2000) meanwhile reports on 
ways that disembodied individuals use language in 
the process of cultural identity formation on the 
Internet, and similarly, Reeder et al. (2004) attempt 
to analyze and record cultural differences in self- 
presentation in an online setting. In a related discus- 
sion, contributors to the collection edited by Smith 
and Kollock (1998) offer counter-arguments to the 
suggestion that as a site of disembodied identity, 
cyberspace may eliminate consideration of racial 
identity; instead, they suggest that cyberindividuals 
may simply develop new nonvisual criteria for people
to judge (or misjudge) the races of others. 

Online identities may therefore be multiple, fluid, 
manipulated and/or may have little to do with the 
“real selves” of the persons behind them (Fernanda 
Zambrano, 1998; Jones, 1997; Jordan, 1999; 
Rheingold, 2000; Wong, 2000). Is “identity decep- 
tion” a special problem on the Internet? Some theo- 
rists believe so. Jones (1997) examines in detail the 
way that assumed identities can lead to “virtual 
crime”, while Jordan suggests that identity fluidity 
can lead to harassment and deception in cyberspace. 
Levy (2001), on the other hand, argues that decep- 
tion is no more likely in cyberspace than via any 
other medium, and even suggests that the cultures of 
virtual communities actively discourage the irresponsibility
of anonymity.

Virtual Community, Virtual Culture, and 
Deterritorialization 

Are all communities — online and offline — virtual to 
some degree? In his classic text Imagined Commu- 
nities, Anderson (1991) argues that most national 
and ethnic communities are imagined because mem- 
bers “will never know most of their fellow-members, 
meet them, or even hear them, yet in the minds of 
each lives the image of their communion” (p. 5). 
Ribeiro (1995) extends Anderson’s model to argue 
that cyberculture, computer English and “electronic 
capitalism” are necessary internal characteristics of 
a developing virtual transnational community. 
Burbules (2000) nevertheless cautions us to remember
that “imagined” communities are also “real”, and
undertakes a careful analysis of the notion of com- 
munity that ultimately situates virtual communities as 
“actual.” 

In the same way that virtual identities are disem- 
bodied, virtual communities are (usually) 
deterritorialized — a feature highlighted by a number 
of writers (Blanco, 1999; Sudweeks, 1998). Interest- 
ingly, Poster (2001) draws parallels between online 
virtual communities and other ethnicities that have 
survived in the absence of “a grounded space” — 
such as Jewishness. 

What, then, are the defining features of virtual 
communities? A number of theorists posit that virtual 
communities can best be described as a constantly 
evolving “collective intelligence” or “collective con- 
sciousness” that has been actualized by Internet 
technologies (Abdelnour-Nocera, 2002b; Guedon, 
1997; Levy, 2001a; Poster, 2001; Sudweeks, 1998). 

More common, however, are theoretical discus- 
sions of the construction of a group culture — and of 
shared identity and meaning — as a feature of virtual 
community (Abdelnour-Nocera, 2002a; Baym, 1998; 
Blanco, 1999; Levy, 2001; Porter, 1997; Walz, 2000). 
Essays contained in the collection edited by Shields 
(1996) examine the socio-cultural complexities of 
virtual reality and questions of identity, belonging and 
consciousness in virtual worlds. Abdelnour-Nocera 
(1998) suggests that Geertz’s (1973) idea of culture 
as a “web of meaning that he (man) himself has 



spun” (p. 194) is most useful when considering the 
construction of shared meaning in a community 
where language is the main expressive and inter- 
pretative resource. 

Virtual communities share a number of other 
common internal features, including: use and devel- 
opment of specific language (Abdelnour-Nocera, 
2002b); style, group purpose, and participant 
characteristics (Baym, 1998); privacy, property, 
protection, and privilege (Jones, 1998); forms of 
communication (Jordan, 1999); customary laws (e.g., 
reciprocity), social morality, freedom of speech, 
opposition to censorship, frequent conflicts, flaming 
as “punishment” for rule-breaking, formation of 
strong affinities and friendships (Levy, 2001); unique
forms of immediacy, asynchronicity, and anonymity 
(Michaelson, 1996); and internal structure and dy- 
namics (Smith & Kollock, 1998). 

Strikingly absent from most discussions of the 
creation and nature of online communities is much 
mention of the role of the language of communica- 
tion, and most contributions apparently assume that 
English is the language of cyberspace. If, as Adam, 
Van Zyl Slabbert, and Moodley (1998) argue, “lan- 
guage policy goes beyond issues of communication 
to questions of collective identity” (p. 107), we 
might expect to see more careful examination of the 
choice of language that different users and commu- 
nities make, and how this contributes to the sense of 
community online. Only very recently in a special 
issue of the Journal of Computer-Mediated Com- 
munication have research reports on the “multilin- 
gual internet” and discussions of online language 
and community begun to appear (see Danet & 
Herring [2003] and references therein). 

The Promises of Cybertechnology for 
Identity and Community: Hopes 
and Fears 

Papers assembled in the recent edited collection of 
Stald and Tufte (2002) present a diverse selection
of developing engagements of cultural groups in 
what is characterized as the “global metropolis” of 
the Internet. Contributing authors report on media 
use by young Danes, by rural black African males 
in a South African university, by South Asian fami- 
lies in London, by women in Indian communities, by 
Iranian immigrants in London, and by young immigrant
Danes. In particular, these contributions focus
on the ways that these minority groups and commu- 
nities understand themselves vis-a-vis a majority 
culture, and the different ways that these groups 
utilize Internet technologies in the construction of 
complex identities. In Canada, Hampton and Wellman 
(2003) document an increase in social contact, and 
increased discussion of and mobilization around 
local issues in a networked suburban community. 

Does technology limit virtual identity and virtual 
community? Reeder et al. (2004) point to ways that 
(culturally biased) technological design of the spaces 
of virtual encounters implicitly shapes the nature of 
the communications that occur there. Poster (1998) 
examines the technological barriers to portrayal of 
ethnicity in different online settings, and Rheingold 
(2000) discusses how technology affects our social 
constructs. Blanco (1999) similarly argues that vir- 
tual communities are “rooted in the sociotechnological 
configuration that the Internet provides” (p. 193) but 
suggests that sociocultural innovations are in fact 
allowing a reciprocal alteration of technology de- 
sign. 

In addition to the worry that Internet technologies 
may change the nature of social interactions, a 
number of contributors raise other fears. Blanco 
(1999) worries that “communication is becoming an
end in itself instead of a tool for political, social and 
cultural action” (p. 193). Jones (1998) also con- 
cludes that efforts to recapture lost community 
online are only partly successful, and that 
cybercommunities bring with them new and distinc- 
tive difficulties, and Miah (2000) similarly argues 
that the claimed emancipatory functions of 
cyberspace are over-stated and counter-balanced 
by the challenges to identity construction. Michaelson 
(1997) worries that “participation in online groups 
has the potential to diminish commitment to local 
communities” (p. 57), and also that the technological 
and cultural resources required for such new forms 
of community may contribute to new forms of strati- 
fication. Poster (2001) reports on angst about 
cyberspace as a destructive mass market that can 
potentially remove ownership of culture from ethnic 
groups. (Interestingly, LaFargue (2002) pursues this
same question, asking “does the commodification of 
a cultural product, such as an exotic handicraft, 
safeguard social conventions within the communi- 



ties of their producers?” (p. 317). This author does 
not, however, explicitly view technologically-driven 
cultural change as a cause for concern, but rather 
develops a theoretical argument relating 
microeconomic engagement of handicraft produc- 
ers with mass markets to ongoing negotiation of 
individual and cultural identity.) 

On the other hand, optimists look to the potential 
of online community as a uniting force. Sy (2000) 
describes how new Filipino virtual communities rep- 
resent a form of cultural resistance to Western 
hegemonic encroachment. Bickel (2003) reports 
how the Internet has allowed the voices of otherwise 
silenced Afghan women to be heard, and the new 
leadership identities for women that this has brought 
about. Michaelson (1996) agrees that for some, 
virtual communities offer opportunities for greater 
participation in public life. Levy (2001a) is enthusiastic
about the nonhierarchical and free nature of
deterritorialized human relationships, and Rheingold 
(2000) offers a number of examples of positive 
social actions and developments that have emerged 
from the establishment of virtual communities. 



FUTURE TRENDS 

Perhaps more significant for future research direc- 
tions are perspectives that highlight the real live 
complexities of technological implications for iden- 
tity and community. A number of authors, for ex- 
ample, caution against the notion of “simple substi- 
tution” of virtual relationships for physical relation- 
ships, and undertake a comparison of “real” and 
“virtual” communities and relationships (Burbules, 
2000; Davis, 1997; Levy, 2001; Miah, 2000; Porter, 
1997; Smith & Kollock, 1998). Importantly, Hampton
and colleagues (Hampton, 2002; Hampton &
Wellman, 2003) have very recently undertaken criti- 
cal and ambitious empirical studies of networked 
communities, in an attempt to test some of the 
optimistic and pessimistic predictions offered by 
theorists. Future work by these and other investiga- 
tors should illuminate in finer detail the differential 
uses, impacts and benefits of ICTs for diverse 
communities and populations over time, as ICTs 
cease to be “new” and become increasingly woven 
into the fabric of human societies. 

CONCLUSION

If, as many theorists argue, Internet and communi- 
cation technologies represent a genuine paradigm 
shift in human communications, or a transition from 
the modern to the postmodern, strong positive and 
negative reactions might be expected as individuals 
and communities grapple with the social implica- 
tions. Indeed, Guedon (1997) takes care to show 
how this polarized pattern of responses has been 
repeated, historically, with the successive appear- 
ances of new communications technologies (such as 
the telephone). Other writers also (Davis, 1997; 
Michaelson, 1996; Poster, 2001; Rutter & Smith, 
1998) explicitly compare and contrast optimistic and 
pessimistic perspectives on virtual identity and com- 
munity. 

As Holmes (1998) argues, social activity can 
now no longer be reduced to simple relations in 
space and time — a realization that offers a new 
challenge to the more positivistic social science 
approaches, since the object of study (social activ- 
ity) is now “eclipsed by the surfaces of electronically 
mediated identities” (p. 8). While one can still study
interaction with computers in situ, such studies will 
fail to examine the reality that individuals are now 
able to participate in multiple worlds whose borders 
and norms radically exceed those previously avail- 
able. 

Perhaps most relevant for the field of HCI is 
Jones’ (1998) contention that many (most?) current 
perspectives on ICT-mediated communications are 
rooted in a “transportation model” of communication 
in which control over movement of information is 
central. Militating against such models of ICT- 
mediated communication is the reality that “people 
like people”, and actively seek to maximize human 
interaction. Developers of ICTs succeed best, he 
argues, when they recognize this reality and “put 
technology in service of conversation rather than 
communication, in service of connection between 
people rather than connection between machines” 
(p. 32). 

KEY TERMS 

Culture: Multiple definitions exist, including es- 
sentialist models that focus on shared patterns of 
learned values, beliefs, and behaviours, and social 
constructivist views that emphasize culture as a 
shared system of problem-solving or of making 
collective meaning. Key to the understanding of 
online cultures — where communication is as yet 
dominated by text — may be definitions of culture 
that emphasize the intimate and reciprocal relation- 
ship between culture and language. 

Cyberculture: As a social space in which hu- 
man beings interact and communicate, cyberspace 
can be assumed to possess an evolving culture or set 
of cultures (“cybercultures”) that may encompass 
beliefs, practices, attitudes, modes of thought, 
behaviours and values. 

Deterritorialized: Separated from or existing 
without physical land or territory. 

Disembodied: Separated from or existing with- 
out the body. 

Modern: In the social sciences, “modern” re- 
fers to the political, cultural, and economic forms 
(and their philosophical and social underpinnings) 
that characterize contemporary Western and, argu- 
ably, industrialized society. In particular, modernist 
cultural theories have sought to develop rational and 
universal theories that can describe and explain 
human societies. 

Postmodern: Theoretical approaches charac- 
terized as postmodern, conversely, have abandoned 
the belief that rational and universal social theories 
are desirable or exist. Postmodern theories also 
challenge foundational modernist assumptions such 
as “the idea of progress,” or “freedom”. 

INTRODUCTION 

Question Answering (QA) is one of the branches of 
Artificial Intelligence (AI) that involves the process- 
ing of human language by computer. QA systems 
accept questions in natural language and generate 
answers often in natural language. The answers are 
derived from databases, text collections, and knowl- 
edge bases. The main aim of QA systems is to 
generate a short answer to a question rather than a 
list of possibly relevant documents. As it becomes 
more and more difficult to find answers on the World 
Wide Web (WWW) using standard search engines, 
the technology of QA systems will become increas- 
ingly important. A series of systems that can answer 
questions from various data or knowledge sources 
are briefly described. These systems provide a 
friendly interface to the user of information systems 
that is particularly important for users who are not 
computer experts. The line of development of ideas 
starts with procedural semantics and leads to inter- 
faces that support researchers for the discovery of 
parameter values of causal models of systems under 
scientific study. QA systems were first developed 
roughly during the 1960-1970 decade (Simmons, 
1970). A few of the QA systems that were imple- 
mented during this decade are: 

• The BASEBALL system (Green et al., 1961) 

• The FACT RETRIEVAL system (Cooper, 1964) 

• The DELFI systems (Kontos & Kossidas, 1971; 
Kontos & Papakonstantinou, 1970) 



The BASEBALL System 

This system was implemented in the Lincoln Labo- 
ratory and was the first QA system reported in the 
literature according to the references cited in the 
first book with a collection of AI papers (Feigenbaum 
& Feldman, 1963). The inputs were questions in 
English about games played by baseball teams. The 
system transformed the sentences to a form that 
permitted search of a systematically organized 
memory store for the answers. Both the data and the 
dictionary were list structures, and questions were 
limited to a single clause. 

The FACT RETRIEVAL System 

The system was implemented using the COMIT 
compiler-interpreter system as programming lan- 
guage. A translation algorithm was incorporated into 
the input routines. This algorithm generated the 
translation of all information sentences and all ques- 
tion sentences into their logical equivalents. 

The DELFI System 

The DELFI system answers natural language ques- 
tions about the space relations between a set of 
objects. These are questions with unlimited nesting 
of relative clauses that were automatically trans- 
lated into retrieval procedures consisting of general- 
purpose procedural components that retrieved infor- 
mation from the database that contained data about 
the properties of the objects and their space rela- 
tions. The system was a QA system based on 
procedural semantics. The following is an example 
of a question put to the DELFI system: 

Figure 1. The objects of the DELFI I example 
application database 




“Is an object that has dotted contour below the 
object number 2?” 

The answer is, “Yes, object no. 4,” given the 
objects with numbers 1, 2, 3, 4, and 5 as shown in 
Figure 1 (Kontos, 2004; Kontos & Papakonstantinou, 
1970). 

The DELFI II System 

The DELFI II system (Kontos & Kossidas, 1971) 
was an implementation of the second edition of the 
system DELFI augmented by deductive capabilities. 
In this system, the procedural semantics of the 
questions are expressed using macro-instructions 
that are submitted to a macro-processor that ex- 
pands them with a set of macro-definitions into full 
programs. Every macro-instruction corresponded to 
a procedural semantic component. In this way, a 
program was generated that corresponded to the 
question and could be compiled and executed in 
order to generate the answer. DELFI II was used in 
two new applications. These applications concerned 
the processing of the database of the personnel of an 
organization and the answering of questions by 
deduction from a database with airline flight sched- 
ules using the following rules: 



• If flight F1 flies to city C1, and flight F2 departs 
from city C1, then F2 follows F1. 

• If flight F1 follows flight F2, and the time of 
departure of F1 is at least two hours later than 
the time of arrival of F2, then F1 connects with 
F2. 

• If flight F1 connects with flight F2, and F2 
departs from city C1, and F1 flies to city C2, 
then C2 is reachable from C1. 

Given a database that contains the following 
data: 

• F1 departs from Athens at 9 and arrives at 
Rome at 11 

• F2 departs from Rome at 14 and arrives at 
Paris at 15 

• F3 departs from Rome at 10 and arrives at 
London at 12 

If the question “Is Paris reachable from Ath- 
ens?” is submitted to the system, then the answer it 
gives is yes, because F2 follows F1, and the time of 
departure of F2 is three hours later than the time of 
arrival of F1. It should be noted also that F1 departs 
from Athens, and F2 flies to Paris. 

If the question “Is London reachable from Ath- 
ens?” is submitted to the system, then the answer it 
gives is no, because F3 follows F1, but the time of 
departure of F3 is one hour earlier than the time of 
arrival of F1. It should be noted here that F1 departs 
from Athens, and F3 flies to London. 
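
The three rules and the example database above can be sketched in a few lines of Python. This is only an illustrative sketch and not the DELFI II implementation, which generated programs through macro expansion; the names Flight, follows, connects, and reachable are assumptions introduced here, and times are taken as whole hours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    name: str
    origin: str
    destination: str
    departs: int  # hour of departure
    arrives: int  # hour of arrival


# The example database given in the text.
FLIGHTS = [
    Flight("F1", "Athens", "Rome", 9, 11),
    Flight("F2", "Rome", "Paris", 14, 15),
    Flight("F3", "Rome", "London", 10, 12),
]


def follows(later, earlier):
    # Rule 1: `later` departs from the city that `earlier` flies to.
    return later.origin == earlier.destination


def connects(later, earlier, min_gap=2):
    # Rule 2: `later` follows `earlier` and departs at least
    # `min_gap` hours after `earlier` arrives.
    return follows(later, earlier) and later.departs - earlier.arrives >= min_gap


def reachable(origin, destination, flights=FLIGHTS):
    # Rule 3: a pair of connecting flights leads from origin to destination.
    return any(
        connects(later, earlier)
        and earlier.origin == origin
        and later.destination == destination
        for earlier in flights
        for later in flights
    )


print(reachable("Athens", "Paris"))   # True
print(reachable("Athens", "London"))  # False
```

As in the rules, only itineraries of exactly two flights are considered, so a direct flight alone does not make a city reachable in this sketch.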

In Figure 2, the relations between the flights and 
the cities are shown diagrammatically. 



Figure 2. The relations between the flights and 
cities of the DELFI II example application 
(Kontos, 2003) 

BACKGROUND 

The Structured Query Language (SQL) 
QA Systems 

In order to facilitate the commercial application of 
the results of research work such as that described so 
far, it was necessary to adapt the methods used to the 
industrial database environment. One important ad- 
aptation was the implementation of the procedural 
semantics interpretation of natural language ques- 
tions using a commercially available database re- 
trieval language. The SQL QA systems implemented 
by different groups, including the author’s, followed 
this direction by using SQL so that the questions can 
be answered from any commercial database system. 

The domain of an illustrative application of our 
SQL QA system involves information about different 
countries. The representation of the knowledge of 
the domain of application connected a verb like 
exports or has capital to the corresponding table of 
the database to which the verb is related. This 
connection between the verbs and the tables pro- 
vided the facility of the system to locate the table a 
question refers to using the verbs of the question. 
During the analysis of questions by the system, an 
ontology related to the domain of application may be 
used for the correct translation of ambiguous ques- 
tions to appropriate SQL queries. Some theoretical 
analysis of SQL QA systems has appeared recently 
(Popescu et al., 2003), and a recent system with a 
relational database is described in Samsonova et al. 
(2003). 
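
The connection between verbs and tables can be illustrated with a small sketch that uses Python's standard sqlite3 module. The table names, column names, question pattern, and sample rows are all assumptions made up for illustration; they are not taken from the system described here.

```python
import sqlite3

# Hypothetical lexicon: verb -> (table, subject column, object column).
VERB_TABLES = {
    "exports": ("exports", "country", "product"),
    "has capital": ("capitals", "country", "capital"),
}


def question_to_sql(question):
    """Translate 'Which country <verb> <object>?' into a parameterized query."""
    text = question.strip().rstrip("?").strip()
    for verb, (table, subject_col, object_col) in VERB_TABLES.items():
        marker = f" {verb} "
        if marker in text:
            obj = text.split(marker, 1)[1].strip()
            # Identifiers come only from the fixed lexicon; the value is a parameter.
            sql = f"SELECT {subject_col} FROM {table} WHERE {object_col} = ?"
            return sql, (obj,)
    raise ValueError("no known verb in question")


# Illustrative in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exports (country TEXT, product TEXT)")
conn.execute("CREATE TABLE capitals (country TEXT, capital TEXT)")
conn.executemany("INSERT INTO exports VALUES (?, ?)",
                 [("Greece", "olive oil"), ("France", "wine")])
conn.executemany("INSERT INTO capitals VALUES (?, ?)",
                 [("Greece", "Athens"), ("France", "Paris")])


def answer(question):
    sql, params = question_to_sql(question)
    return [row[0] for row in conn.execute(sql, params)]


print(answer("Which country exports wine?"))
print(answer("Which country has capital Athens?"))
```

The verb found in the question selects the table, which is the role the verb-to-table connection plays in the description above; resolving ambiguous questions with an ontology is not attempted in this sketch.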

QA From Texts Systems 

Some QA systems use collections of texts instead of 
databases for extracting answers. Most such sys- 
tems are able to answer simple factoid questions 
only. Factoid questions seek an entity involved in a 
single fact. Some recent publications on QA from 
texts are Diekema (2003), Doan-Nguyen and Kosseim 
(2004), Harabagiu et al. (2003), Kosseim et al. (2003), 
Nyberg et al. (2002), Plamondon and Kosseim (2002), 
Ramakrishnan (2004), Roussinof and Robles-Flores 
(2004), and Waldinger et al. (2003). Some future 
directions of QA from texts are proposed in Maybury 
(2003). An international competition between ques- 
tion answering systems from texts has been orga- 
nized by NIST (National Institute of Standards and 
Technology) (Voorhees, 2001). 

What follows describes how the information 
extracted from scientific and technical texts may be 
used by future systems for the answering of com- 
plex questions concerning the behavior of causal 
models using appropriate linguistic and deduction 
mechanisms. An important function of such sys- 
tems is the automatic generation of a justification or 
explanation of the answer provided. 

The ARISTA System 

The ARISTA system is a QA system that answers 
questions by knowledge acquisition from natural 
language texts; it was first presented in Kontos 
(1992). The system is based on the representation- 
independent method, also called ARISTA, which 
finds the appropriate causal sentences in a text and 
chains them in order to discover causal chains. 

This method achieves causal knowledge extrac- 
tion through deductive reasoning performed in re- 
sponse to a user’s question. This method is an 
alternative to the traditional method of translating 
texts into a formal representation before using their 
content for deductive question answering from texts. 
The main advantage of the ARISTA method is that 
since texts are not translated into any representa- 
tion, formalism retranslation is avoided whenever 
new linguistic or extra linguistic prerequisite knowl- 
edge has to be used for improving the text process- 
ing required for question answering. 

An example text that is an extract from a medi- 
cal physiology book in the domain of pneumonology 
and, in particular, of lung mechanics enhanced by a 
few general knowledge sentences was used as a 
first illustrative example of primitive knowledge 
discovery from texts (Kontos, 1992). The ARISTA 
system was able to answer questions from text that 
required the chaining of causal knowledge acquired 
from the text and produced answers that were not 
explicitly stated in the input texts. 
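
The idea of answering by chaining causal sentences directly from the text, without first translating it into a formal representation, can be illustrated roughly as follows. The sketch is not the ARISTA implementation: it assumes invented toy sentences of the single form "X causes Y", and the function names are hypothetical.

```python
import re

# Invented toy text; the original lung-mechanics extract is not reproduced here.
TEXT = (
    "Pump failure causes pressure loss. "
    "Pressure loss causes valve closure. "
    "Valve closure causes flow reduction."
)


def causal_sentences(text):
    """Yield (cause, effect, sentence) for sentences of the form 'X causes Y'."""
    for sentence in re.split(r"\.\s*", text):
        sentence = sentence.strip()
        match = re.fullmatch(r"(.+?) causes (.+)", sentence)
        if match:
            yield match.group(1).lower(), match.group(2).lower(), sentence


def find_chain(text, cause, effect, seen=None):
    """Depth-first search for sentences that link `cause` to `effect`.

    The text is rescanned at every step; no intermediate
    representation of its content is stored.
    """
    seen = seen or set()
    for c, e, sentence in causal_sentences(text):
        if c != cause or sentence in seen:
            continue
        if e == effect:
            return [sentence]
        rest = find_chain(text, e, effect, seen | {sentence})
        if rest is not None:
            return [sentence] + rest
    return None


chain = find_chain(TEXT, "pump failure", "flow reduction")
print("Yes, because:" if chain else "No chain found.")
for step in chain or []:
    print(" -", step)
```

The returned list of sentences doubles as a simple justification of the answer, since each step is a sentence taken verbatim from the text.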

The Use of Information Extraction 

A system using information extraction from texts 
for QA was presented in Kontos and Malagardi 
(1999). The system described had as its ultimate aim 
the creation of flexible information extraction tools 
capable of accepting natural language questions and 
generating answers that contained information ei- 
ther directly extracted from the text or extracted 
after applying deductive inference. The domains 
examined were oceanography, medical physiology, 
and ancient Greek law (Kontos & Malagardi, 1999). 
The system consisted of two main subsystems. The 
first subsystem achieved the extraction of knowl- 
edge from individual sentences, which was similar to 
traditional information extraction from texts (Cowie 
& Lehnert, 1996; Grishman, 1997), while the second 
subsystem was based on a reasoning process that 
combines knowledge extracted by the first sub- 
system for answering questions without the use of a 
template representation. 

QUESTION ANSWERING FOR 
MODEL DISCOVERY 

The AROMA System 

A modern development in the area of QA that points 
to the future is our implementation of the AROMA 
(ARISTA Oriented Model Adaptation) system. This 
system is a model-based QA system that may sup- 
port researchers in the discovery of parameter 
values of procedural models of systems by answer- 
ing what if questions (Kontos et al., 2002). What if 
questions are considered here to involve the 
computation of data describing the behavior of a 
simulated model of a system. 

The knowledge discovery process relies on the 
search for causal chains, which in turn relies on the 
search for sentences containing appropriate natural 
language phrases. In order to speed up the whole 
knowledge acquisition process, the search algorithm 
described in Kontos and Malagardi (2001) was used 
for finding the appropriate sentences for chaining. 
The increase in speed results because the repeated 
sentence search is made a function of the number of 
words in the connecting phrases. This number is 
usually smaller than the number of sentences of the 
text that may be arbitrarily large. 

The general architecture of the AROMA system 
is shown in Figure 3 and consists of three sub- 
systems; namely, the Knowledge Extraction Sub- 
system, the Causal Reasoning Subsystem, and the 
Simulation Subsystem. All of these subsystems have 
been implemented by our group and tested with a 
few biomedical examples. The subsystems of the 
AROMA system are briefly described next. 



Figure 3. The AROMA system architecture (Kontos et al., 2003) 

The Knowledge Extraction Subsystem 

This subsystem integrates partial causal knowledge 
extracted from a number of different texts. This 
knowledge is expressed in natural language using 
causal verbs such as regulate, enhance, and in- 
hibit. These verbs usually take as arguments entities 
such as entity names and process names that occur 
in the texts that we use for the applications. In this 
way, causal relations are expressed between the 
entities, processes, or entity-process pairs. 

The input texts are submitted first to a prepro- 
cessing module of the subsystem, which automati- 
cally converts each sentence into a form that shows 
word data with numerical information concerning 
the identification of the sentence that contains the 
word and its position in that sentence. This conver- 
sion has nothing to do with logical representation of 
the content of the sentences. It should be empha- 
sized that we do not deviate from our ARISTA 
method with this conversion. We simply annotate 
each word with information concerning its position 
within the text. This form of sentences is then 
parsed, and partial texts with causal knowledge are 
generated. 
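
A minimal sketch of this kind of annotation is given below. It assumes whitespace tokenization and simple sentence splitting, uses invented example sentences, and is not the preprocessing module of AROMA or the search algorithm of Kontos and Malagardi (2001); it only shows how sentence and position numbers let a phrase be located by looking up its words.

```python
import re
from collections import defaultdict


def annotate(text):
    """Return (word, sentence_number, position_in_sentence) triples."""
    triples = []
    sentences = [s for s in re.split(r"[.!?]\s*", text) if s.strip()]
    for s_no, sentence in enumerate(sentences, start=1):
        for pos, word in enumerate(sentence.split(), start=1):
            triples.append((word.lower(), s_no, pos))
    return triples


def build_index(triples):
    """Map each word to the (sentence, position) pairs where it occurs."""
    index = defaultdict(list)
    for word, s_no, pos in triples:
        index[word].append((s_no, pos))
    return index


def sentences_with_phrase(index, phrase):
    """Sentence numbers in which the words of `phrase` occur consecutively."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = set()
    for s_no, pos in index.get(words[0], []):
        if all((s_no, pos + k) in index.get(w, [])
               for k, w in enumerate(words[1:], start=1)):
            hits.add(s_no)
    return sorted(hits)


# Invented example sentences using causal verbs mentioned in the text.
text = "Protein A regulates protein B. Protein B inhibits protein C."
index = build_index(annotate(text))
print(sentences_with_phrase(index, "protein b"))           # [1, 2]
print(sentences_with_phrase(index, "inhibits protein c"))  # [2]
```

The words keep their natural language form; only their location in the text is recorded, so no logical representation of the sentence content is built.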

The Causal Reasoning Subsystem 

The output of the first subsystem is used as input to 
the second subsystem, which combines causal knowl- 
edge in natural language form to produce answers 
and model data by deduction not mentioned explicitly 
in the input text. The operation of this subsystem is 
based on the ARISTA method. The sentence frag- 
ments containing causal knowledge are parsed, and 
the entity-process pairs are recognized. The user 
questions are processed, and reasoning goals are 
extracted from them. The answers to the user 
questions that are generated automatically by the 
reasoning process contain explanations in natural 
language form. All this is accomplished by the 
chaining of causal statements using prerequisite 
knowledge such as ontology to support the reasoning 
process. 



The Simulation Subsystem 

The third subsystem is used for modeling the dynam- 
ics of a system specified on the basis of the texts 
processed by the first and second subsystems. The 
data of the model, such as structure and parameter 
values, are extracted from the input texts combined 
with prerequisite knowledge, such as ontology and 
default process and entity knowledge. The solution 
of the equations describing the system is accom- 
plished with a program that provides an interface 
with which the user may test the simulation outputs 
and manipulate the structure and parameters of the 
model. 



FUTURE TRENDS 

The architecture of the AROMA system is pointing 
to future trends in the field of QA by serving, among 
other things, the processing of what if questions. 
These are questions about what will happen to a 
system under certain conditions. Implementing sys- 
tems for answering what if questions will be an 
important research goal in the future (Maybury, 
2003). 

Another future trend is the development of sys- 
tems that may conduct an explanatory dialog with 
their human user by answering why questions using 
the simulated behavior of system models. A why 
question seeks the reason for the occurrence of 
certain system behaviors. 

The work on model discovery QA systems paves 
the way toward important developments and justi- 
fies effort leading to the development of tools and 
resources, aiming at the solution of the problems of 
model discovery based on larger and more complex 
texts. These texts may report experimental data that 
may be used to support the discovery and adaptation 
of models with computer systems. 

CONCLUSION 

A series of systems that can answer questions from 
various data or knowledge sources was briefly de- 
scribed. These systems provide a friendly interface 
to the user of information systems that is particularly 
important for users who are not computer experts. 
The line of development of systems starts with 
procedural semantics systems and leads to inter- 
faces that support researchers in the discovery of 
model parameter values of simulated systems. If 
these efforts for more sophisticated human-com- 
puter interfaces succeed, then a revolution may take 
place in the way research and development are 
conducted in many scientific fields. This revolution 
will make computer systems even more useful for 
research and development. 

KEY TERMS 

Causal Chain: A sequence of instances of 
causal relations such that the effect of each instance 
except the last one is the cause of the next one in 
sequence. 

Causal Relation: A relation between the mem- 
bers of an entity-process pair, where the first mem- 
ber is the cause of the second member, which is the 
effect of the first member. 

Explanation: A sequence of statements of the 
reasons for the behavior of the model of a system. 

Model: A set of causal relations that specify the 
dynamic behavior of a system. 

Model Discovery: The discovery of a set of 
causal relations that predict the behavior of a sys- 
tem. 

Ontology: A structure that represents taxo- 
nomic or meronomic relations between entities. 

Procedural Semantics: A method for the trans- 
lation of a question by a computer program into a 
sequence of actions that retrieve or combine parts of 
information necessary for answering the question. 

Question Answering System: A computer 
system that can answer a question posed to it by a 
human being using prestored information from a 
database, a text collection, or a knowledge base. 

What If Question: A question about what will 
happen to a system under given conditions or inputs. 

Why Question: A question about the reason for 
the occurrence of a certain system behavior. 

INTRODUCTION 

Electronic commerce (EC) is, at first sight, an 
electronic means to exchange large amounts of 
product information between users and sites. This 
information must be clearly written since any user 
who accesses the site must understand it. Given the 
large amounts of information available at the site, 
interaction with an e-market site becomes an effort. 
It is also time-consuming, and the user feels disori- 
ented as products and clients are always on the 
increase. One solution to make online shopping 
easier is to endow the EC site with a recommender 
system. Recommender systems are implanted in EC 
sites to suggest services and provide consumers 
with the information they need in order to decide 
about possible purchases. These tools act as a 
specialized salesperson for the customer, and they 
are usually enhanced with customization capabili- 
ties; thus they adapt themselves to the users, basing 
themselves on the analysis of their preferences and 
interests. Recommenders rely mainly on user inter- 
faces, marketing techniques, and large amounts of 
information about other customers and products; all 
this is done, of course, in an effort to propose the 
right item to the right customer. Besides, 
recommenders are fundamental elements in sustain- 
ing usability and site confidence (Egger, 2001); 
that’s the reason why e-market sites give them an 
important role in their design (Spiekermann & 
Paraschiv, 2002). 

If a recommender system is to be perceived as 
useful by its users, it must address several problems, 
such as the lack of user knowledge in a specific 
domain, information overload, and a minimization of 
the cost of interaction. 

EC recommenders are gradually becoming pow- 
erful tools for EC business (Gil & Garcia, 2003), 
making use of complex mechanisms mainly in order 
to support the user’s decision process by allowing 
analogical reasoning by the human user and by 
avoiding the disorientation that occurs when 
one has large amounts of information to analyse and 
compare. This article describes some fundamental 
aspects in building real recommenders for EC. 

We will first set up the scenario by exposing the 
importance of recommender systems in EC, as well 
as the stages involved in a recommender-assisted 
purchase. Next, we will describe the main issues 
along three main axes: first, how recommender 
systems require a careful elicitation of user require- 
ments; after that, the development and tuning of the 
recommendation algorithms; and, finally, the design 
and usability testing of the user interfaces. Lastly, 
we will show some future trends in recommenders 
and a conclusion. 



BACKGROUND 

E-commerce sites try to mimic the buying and selling 
protocols of the real world. At these virtual shops, 
we find metaphors of real trade, such as catalogues 
of products, shopping carts, shop windows, and even 
“salespersons” that help us along the process (see 
http://www.vervots.com). 

There exist quite a number of models proposed to 
describe the real world customer-buying process 
applied to electronic trade; among these, we might 
mention the Bettman model (Bettman, 1979), the 
Howard-Sheth model (Howard & Sheth, 1994) or 
the AIDCA (Attention Interest Desire Conviction 
Action) model (Shimazu, 2002). The theory of pur- 
chase decision involves many complex aspects, among 
which one might include the psychological ones, 
those of marketing, social environment, and so forth. 
The behaviour of the buyer (Schiffman & Kanuk, 
1997) also includes a wide spectrum of experi- 
ences associated with the use and consumption of 
products and services: attitudes, lifestyles, sense of 
ownership, satisfaction, pleasure inside groups, en- 
tertainment, and so forth. 

Therefore, the fundamental goal today in EC is 
that of providing the virtual shops with all of the 
capabilities of physical trade, thus becoming a natu- 
ral extension of the traditional processes of buying 
and selling. One must provide these applications 
with dynamism, social and adaptive capacities in 
order to emulate traditional trade. The recommender 
system can supply the user with information related 
to the particular kind of shopping technique he or she 
is using. The most important phases that support the 
user’s decision can be summarized as follows: 

• Requirement of Identification: It permits 
the entry of every user into the system in an 
individual way, thus making it possible for the 
recommender to make use of a customized 
behaviour. 

• Product Brokering: The user, thus properly 
identified, interacts with the site in search of 
certain products and/or services; the searching 
process is facilitated by recommender sys- 
tems, which relieve the user from information 
overload and helps each concrete user to lo- 
cate the desired product. 

• Merchant Brokering: This type of buying 
mechanism comes into play when users want 
to acquire a certain product already known to 
them; at this moment, they look for the best 
offer for this precise item. The recommender 
systems make use of a comparison process, 
carry out the extraction of necessary informa- 
tion from different virtual shops, and work 
towards the goals established by the buyer 
(best price, best condition, etc.). 

• Negotiation: This aspect reflects the custom- 
ized interaction between the buyer and the site 
in the process of pre-acquisition, as well as the 
maintenance of these relations in post-sale 
process. This process is performed in the trans- 
action between the user and the site, and it 
depends on the purchasing needs of the user 
and on the sales policy of the site. The user 
must perceive negotiation as a transparent 
process. In order to benefit and activate the 
relationship with the site, one must facilitate 
fidelity policies. Also, one should avoid perva- 
sive recommendation or cross-sell, which the 
user could see as obtrusive and abusive meth- 
ods. This will consolidate the success of the 
virtual shop. 

• Confidence and Evaluation: Recommender 
Systems work with relevant information about 
the user. A significant part of this phase of 
approximation to the user is related to the 
safety in the transactions and the privacy of the 
information that the user hands over to the 
company. Besides, post-sale service is critical 
in virtual shops, both in the more straightfor- 
ward sense (an item is not acceptable) and in 
the sense of confirming the results of a given 
recommendation. This reinforces the confi- 
dence of the user in the site and integrates them 
in a natural way in the sales protocol. Confi- 
dence in a recommendation system relies on 
three fundamental premises (Hayes, Massa, 
Avesani, & Cunningham, 2002): confidence in 
the recommender itself, the assumption that it has 
sufficient information on our tastes and needs, and 
the acceptance that it has knowledge of other 
possible alternatives. 

There exists a large number of recommenders 
over the Internet. These systems have succeeded in 
domains as diverse as movies, news articles, Web 
pages, or wines; especially well-known examples 
are the ones that we find in Amazon.com or 
BarnesAndNoble.com. 



MAIN ISSUES IN RECOMMENDATION 

The user’s interaction with the recommender sys- 
tem can be seen as two different but related pro- 
cesses. There is a first stage in which the system 
builds a knowledge base about the user (Input Stage 
we could say) and then a second stage in which a 
recommendation is made, at the same time taking 
notice of the user’s preferences (Output Stage). 
Normally, systems offer a set of possible products 
that try, perhaps without previous knowledge about 
the particular user, to extract some sort of first- 
approximation ratings. When the user makes further 
visits, both the data extracted from a first contact 
and the information garnered from any purchases 
made will be of use when a recommendation is 
produced. By analyzing this information by means of 
different techniques (which we explain later), the 
systems are able to create profiles that are later to be 
used for recommendations. 

The second phase is the output stage, in which the 
user gets a number of recommendations. This is a 
complex phase, and one must take into account the 
fact that the user gets information about a given 
product, the ease with which the customer gets new 
recommendations, the actual set of new recommen- 
dations produced, and so forth. 

Let us now describe the real-world data that are 
extracted from the user and their importance from 
the point of view of usability. 

User Data 

The information about both user and domain defines 
the context in recommendation; it establishes how 
the various concepts can be extracted from a list of 
attributes that are produced by the interaction of the 
user with the system. These attributes must carefully 
be chosen for the sake of brevity, and also because 
they are the essential and effective information em- 
ployed as a user model in the specialized EC domain. 



User actions at the user interface, such as 
requiring some kind of information, scrolling or 
drilling down group hierarchies, and so forth, are 
translated into user preferences for different parts 
of the result, and fed back into the system to 
prioritize further processing. 

The following table contains the kinds of data 
that one can hope to extract from the user in the 
different phases of interaction with the system. 

A recommender system may take input from users 
implicitly or explicitly, or as a combination of both. 
Table 1 summarizes many of the attributes used for 
building a customized recommendation. This infor- 
mation, attending to the complex elaboration of the 
data extracted about users in the domain, can be 
divided into three categories: 

• Explicit: Expressed by the user directly (e.g., 
registration data such as name, job, and address, 
and any other attributes gathered through direct 
questions and answers). 

• Implicit: Captured by user interactivity with 
the system (e.g., purchase history, naviga- 
tional history). 

• Synthetic: Added by contextual techniques 
mixing explicit and implicit ones. The principle 
behind the elaboration of these data is that 
consumers attach values to all of the attributes 
of a service. The total value of a service is a 
function of the values of its components (e.g., 
parameters adding time-delay aspects in the 
model, semantic content traits in the items, 
etc.). The choice of the user and the recom- 
mendation is a function of the absence or 
incorporation of specific characteristics of the 
domain. These characteristics are taken into 
account using information regarding the con- 
tent of items, mainly of semantic content, in 
order to infer the reasons behind a user’s 
preferences. Recommendations suitable to a 
user will be context-dependent. The context of 
a user’s search often has a significant bearing 
on what should be recommended. 

Table 1. Summary of information in EC for context building in recommendation 

Columns: Kind of data; Information extracted from; Yields... 

Personal data: Name, Gender, Age, Job, Income, Address 

Other entries: Related interests; Sequence of URLs visited; Date of purchase; Similar purchases related to content based; Price sensitivity; Purchasing possibilities, ... 

Techniques Used for Recommendation 

Several different approaches have been considered for automated
recommendation systems (Konstan, 2004; Sarwar, Karypis, Konstan, &
Riedl, 2000). These can be classified into three major categories:
those based on user-to-user matching, referred to as collaborative
filtering; those based on item content information; and hybrid
methods, referred to as knowledge-based systems.

• Collaborative-social-filtering systems, which build the
recommendation by aggregation of consumers’ preferences: These kinds
of systems try to find a match to other users, basing themselves on
similarity in either behavioral or social patterns. Statistical
analysis of data, or data mining and knowledge discovery in databases
(KDD) techniques (monitoring the behavior of the user over the system,
ratings over the products, purchase history, etc.), build the
recommendation by analogy with many other users (Breese, Heckerman, &
Kadie, 1998). Similarity between users is computed mainly using the
so-called user-to-user correlation. This technique finds a set of
“nearest neighbors” for each user in order to identify similar likes
and dislikes. Examples of collaborative filtering systems are Ringo
(Shardanand & Maes, 1995) and GroupLens (Konstan, Miller, Maltz,
Herlocker, Gordon, & Riedl, 1997). This technique suffers mainly from
a problem of sparsity: it needs a large volume of users in relation to
the volume of items offered (critical mass) in order to provide
appropriate suggestions.

• Content-based-filtering systems, which extract the information for
suggestions from items the user has purchased in the past: These kinds
of systems use supervised machine learning to induce a classifier that
discriminates between interesting and uninteresting products for the
user on the basis of her purchase history. Classifiers may be
implemented using many different techniques from artificial
intelligence, such as neural networks, Bayesian networks, induced
rules, decision trees, etc. The user model is represented by the
classifier, which allows the system to weigh the user's like or
dislike for an item. This information identifies the most highly
weighted items, which will be recommended to the user. Some
content-based systems also use item-to-item correlation in order to
identify association rules between items, implementing co-purchase or
cross-sell recommendations. Examples include the work of Mooney and
Roy (2000); Syskill & Webert (Pazzani, Muramatsu, & Billsus, 1996),
where a decision tree is used for classifying Web documents in some
content domain on a binary scale (hot and cold); and the well-known
recommendation mechanism for the second or third item in Amazon. This
technique suffers mainly from the problem of over-specialization: the
consumer is driven to the same kind of items he or she has already
purchased. Another important problem arises when recommending new
articles in the store: as no consumer has bought the item before, the
system cannot find it in any purchase history, and it cannot be
recommended until at least one user buys it.

• Knowledge-based systems can be understood as a hybrid of
collaborative-filtering and content-based systems. They build
knowledge about users and link it with knowledge about products. This
information is used to find out which product meets the user’s
requirements. The cross-relationships between products and clients
produce inferences that build the knowledge in the EC engine. Several
papers (Balabanovic & Shoham, 1997; Paulson & Tzanavari, 2003; Shafer,
Konstan, & Riedl, 2001) show the benefits of these systems.
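The user-to-user correlation and “nearest neighbors” idea described in the collaborative-filtering item can be sketched as follows. This is a minimal illustration only, assuming Pearson correlation as the similarity measure and a tiny hand-made ratings dictionary; the names pearson, nearest_neighbors, recommend, and RATINGS are hypothetical and do not come from any of the systems cited.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation computed over the items both users have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def nearest_neighbors(ratings, user, k=2):
    """Return up to k other users with the highest positive correlation."""
    scored = [(pearson(ratings[user], ratings[other]), other)
              for other in ratings if other != user]
    scored.sort(reverse=True)
    return [other for score, other in scored[:k] if score > 0]

def recommend(ratings, user, k=2):
    """Rank items the user has not rated, weighted by neighbor similarity."""
    totals = {}
    for other in nearest_neighbors(ratings, user, k):
        weight = pearson(ratings[user], ratings[other])
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                totals[item] = totals.get(item, 0.0) + weight * rating
    return sorted(totals, key=totals.get, reverse=True)

# Hypothetical ratings on a 1-5 scale (assumption for illustration).
RATINGS = {
    "ana":  {"book": 5, "lamp": 1, "pen": 4},
    "ben":  {"book": 4, "lamp": 2, "pen": 5, "desk": 5},
    "cleo": {"book": 1, "lamp": 5, "pen": 2, "rug": 4},
}
```

With so few users the sparsity problem mentioned above is easy to see: a user whose tastes correlate with nobody receives no suggestions at all.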

Usability in Recommender Systems 

User satisfaction with a recommender system is only partly determined
by the accuracy of the algorithm behind it. A recommender system is
designed as a user-centered tool, in which personalization is one of
the aspects that weighs most heavily in the capabilities the user
perceives when interacting with the virtual shop. The development of a
recommender system is the sum of several complex tasks, and usability
questions arise. To address them, some standard questionnaires were
created. IBM (Lewis, 1995) has suggested some of these: the Post-Study
System Usability Questionnaire (PSSUQ), the Computer System Usability
Questionnaire (CSUQ), and the After-Scenario Questionnaire (ASQ).
Other examples are the Questionnaire for User Interface Satisfaction
(QUIS) (Chin, Diehl, & Norman, 1988) and the System Usability Scale
(SUS; see endnote 1), developed in 1996, whose questions all address
different aspects of the user’s reaction to the Web site as a whole.
Finally, one could also mention the Web site Analysis and Measurement
Inventory (WAMMI; see endnote 2), developed in 1999.

Nowadays, there is a growing number of studies 
that examine the interaction design for recommender 
systems in order to develop general design guide- 
lines (Hayes et al., 2002; Swearingen & Sinha, 2001) 
and to test usability. Their aim is to find the factors 
which mostly influence the usage of the 
recommender systems. They consider such aspects 
as design and layout, functionality, or ease of use 
(Zins, Bauernfeind, Del Missier, Venturini, & 
Rumetshofer, 2004a, 2004b). 

The evaluation procedures that recommender 
systems must satisfy to obtain a level of usability are 
complex to carry out due to the various aspects to be 
measured. Some of the evaluation procedures apply 
techniques that comprise several other steps well 
known (Nielsen & Mack, 1994), such as concepts 
tests, cognitive walkthrough, heuristic evaluations, 



or experimental evaluations by system users. This 
could only be achieved by a cooperation of usability 
experts, real users, and technology providers. 

Usability in recommender systems can be measured by objective and
subjective variables. Objective measures include the task completion
time, the number of queries, and the error rate. Subjective measures
include all other measures, such as the user’s feedback, the level of
confidence in recommendations, and the level of transparency that the
user perceives in recommendations (Sinha & Swearingen, 2002).
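As a rough illustration of the objective measures just listed, the sketch below aggregates them from logged test sessions. The Session record, its field names, and the definition of error rate as errors per user action are assumptions made for this example; an actual study would define them in its own protocol.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Session:
    """One logged usability session (hypothetical record layout)."""
    seconds: float  # task completion time
    queries: int    # number of queries issued to the recommender
    errors: int     # number of erroneous actions
    actions: int    # total number of user actions

def objective_measures(sessions):
    """Aggregate the three objective measures over a list of sessions."""
    total_actions = sum(s.actions for s in sessions)
    total_errors = sum(s.errors for s in sessions)
    return {
        "mean_completion_time": mean(s.seconds for s in sessions),
        "mean_queries": mean(s.queries for s in sessions),
        # Assumption: error rate = errors per user action.
        "error_rate": total_errors / total_actions if total_actions else 0.0,
    }

# Hypothetical data for three participants.
SESSIONS = [
    Session(seconds=120.0, queries=3, errors=1, actions=20),
    Session(seconds=90.0, queries=2, errors=0, actions=15),
    Session(seconds=150.0, queries=4, errors=2, actions=25),
]
```

Subjective measures such as confidence or perceived transparency would come from questionnaires rather than from logs, so they are not covered here.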

FUTURE TRENDS 

Significant research effort is being invested in building support
tools to ensure that the right information is delivered to the right
people at the right time. A clear understanding of the needs and
expectations of customers is at the core of any development. There is
no universally best method for all users in all situations, so
flexibility and customization will continue to drive the development
of recommender systems. The future of recommender systems in EC will
be governed mainly by three fundamental aspects: the need to support
more complex and heterogeneous dialog models, the need to attend to
internationalization and standardization, and support for a nascent
ubiquitous electronic commerce environment.

Different players in the EC marketplace operate with these contents.
The problem grows with dialog models, all the more so between tools
and automation in purchasing, or when attending to the analogical
reasoning of human beings. Recommenders have to be flexible enough to
accommodate any way of purchasing. One of the key components for
developing an electronic commerce environment will be interoperability
that can support a degree of expertise in anything the e-market
offers, within a heterogeneous environment; this will also require the
use of international standards based on open systems.

The EC market needs an international descriptive classification, as an
open and interoperable standard that can be used by all the players in
the interchange. The goal is to produce an appropriate mechanism to
increase the efficiency of management in the EC site, as well as the
personalization and specialization of the customer services offered in
the recommendation.

For the immediate future, recommenders in EC need to incorporate new
trends in the description of items and their connection with users
(Hofmann, 2004). The integration of the rapidly expanding Web services
infrastructure with the richer semantics of the semantic Web, in
particular through the use of more expressive languages for services,
marks the beginning of a new era in EC, as it endows recommender tools
with powerful content knowledge capabilities.

These developments will also come together in the emerging ubiquitous
electronic commerce environment. This will require deploying a network
capable of providing connectivity to a large user and service provider
community through new devices. Applications beyond those currently
envisioned will evolve, and sophisticated user interfaces that
conveniently provide users with information about the services offered
by this new environment will emerge. The ability to obtain user data
in an unobtrusive way will determine the success of recommendations in
environments as everyday as a tourist visit or a trip to the
supermarket.

CONCLUSION 

EC sites are making a big effort to supply the customer with tools
that ease and enhance shopping on the Net. Facilitating the user’s
tasks in EC requires an understanding of consumer behavior, in order
to provide personalized access to the large amount of information one
needs to search and assimilate before making any purchase.
User-centered design in recommender systems advances the study of
realistic models for discovering and maintaining the user decision
processes, as described by the inputs that help build user models. The
need to build recommender systems that improve effectiveness and
perceived user satisfaction has produced a surge in usability studies;
these are carried out by means of different procedures and through the
application of various techniques.

We point out the future importance of recommenders in EC sites, all
the more so given the inclusion of semantic Web models in EC and of
new interface paradigms.

KEY TERMS

Collaborative-Social-Filtering Recommender 
Systems: Technique based on the correlation be- 
tween users’ interest. This technique creates inter- 
est groups between users, based on the selection of 
the same. 

Content-Based-Filtering Recommender Systems: Technique based on the
correlation between item contents by statistical studies about
different characteristics. These techniques compute user-purchase
histories in order to identify association rules between items.

Direct Recommendation: This kind of recom- 
mendation is based on a simple user request mecha- 
nism in datasets. The user interacts directly with the 
system that helps him in the search of the item 
through a list with the n-articles that are closest to his 
or her request in relation to a previously-known 
profile. 

Information Overload: Undesirable or irrel- 
evant information that disturbs the user and distracts 
him or her from the main objective. This kind of 
problem usually occurs in contexts that offer exces- 
sive amounts of information, badly handled due to 
low usability systems. 

Knowledge-Based Systems: A hybrid extended 
technique between collaborative-filtering and con- 
tent-based systems. It builds knowledge about users 
by linking their information with knowledge about 
products. 



Pervasive Recommendation: Unsolicited information about products or
services related to the one requested. These recommendations are
usually shown as advertising or secondary recommendations, acting as
fillers for the page or as new elements in the interface; they could
be perceived as disturbing elements. The system of inner marketing
establishes a policy of publicity for each product destined to given
segments of consumers. This provides a method to perform cross-sell
marketing.

Recommender Systems in E-Commerce: Tools deployed in EC sites for
suggesting services and for providing consumers with the information
they need to decide which services to acquire. They are usually
domain-specialized tools.

ENDNOTES 

1 http://www.usability.serco.com/trump/documents/Suschapt.doc

2 http://www.ucc.ie/hfrg/questionnaires/wammi/

Replicating Human Interaction to Support E-Learning

INTRODUCTION

HCI-related subjects need to be considered to make e-learning more
effective; examples of such subjects are: psychology, sociology,
cognitive science, ergonomics, computer science, software engineering,
users, design, usability evaluation, learning styles, teaching styles,
communication preference, personality types, and neurolinguistic
programming language patterns. This article discusses how some
components of human interaction (HI) can be introduced to increase the
effectiveness of e-learning by using an intuitive interactive
e-learning tool that incorporates communication preference (CP),
specific learning styles (LS), neurolinguistic programming (NLP)
language patterns, and subliminal text messaging. The article starts
by looking at the current state of distance learning tools (DLTs),
intelligent tutoring systems (ITS), and “the way we learn.” It then
discusses HI and shows how this was implemented to enhance the
learning experience.

BACKGROUND 

In this section, we briefly review the current state of DLTs and ITS.

The generally accepted standard for current DLTs is that the learner
must be able to experience self-directed learning and both
asynchronous and synchronous communication (Janvier & Ghaoui, 2002a,
2003a).

Bouras and Philopulos (2000) consider that a “distributed virtual
learning environment,” using a combination of HTML, Java, and VRML
(Virtual Reality Modelling Language), makes acquiring knowledge easier
by providing such facilities as virtual chatrooms for
student-student-teacher interaction, lectures using the virtual
environment, announcement boards, slide presentations, and links to
Web pages.

People’s experience (including ours) of a number of DLTs was that,
while they achieved the objective of containing and presenting
knowledge extremely well, the experience of using them fell far short
of normal HI, was flat, and gave no rewarding motivation. The user had
to accept a standard presentation that did not vary from user to user;
in brief, there was no real system that approached HI, and, thus, the
learning experience lacked the quality required to make it as
effective as it should be.

Similarly with ITS, they are normally built for a 
specific purpose with student modelling being devel- 
oped from the interaction between the student and 
the system. 

Murray (1997) postulates that while ITS, also 
called knowledge-based tutors, are becoming more 
common and proving to be increasingly effective, 
each one must still be built from scratch at a signifi- 
cant cost. Domain independent tools for authoring all 
aspects of ITS (the domain model, the teaching 
strategies, the learner model, and the learning envi- 
ronment) have been developed. They go beyond 
traditional computer-based instruction in trying to 
build models of subject matter expertise, instruc- 
tional expertise, and/or diagnostic expertise. They 
can be powerful and effective learning environ- 
ments; however, they are very expensive in time and 
cost, and difficult to build. 

Nkambou and Kabanza (2001) report that most recent ITS architectures
have focused on the tutor or curriculum components, with little
attention being paid to planning and intelligent collaboration between
the different components. They suggest that the ideal architecture
contains a curriculum model, a tutor (pedagogical) model, and a
learner model; this last is central to an ITS.

To move forward, e-learning requires a combina- 
tion of both; however, Murray (1999), in common 
with many other researchers, believes that ITS are 
too complex for the untrained user and that: 

we should expect users to have a reasonable 
degree of training in how to use them, on the 
order of database programming, CAD-CAM 
authoring, 3-D modelling, or spreadsheet macro 
scripting. 

In e-learning, development has taken two routes: that of the DLT and
that of the ITS. With both, there is no effort to pre-determine the
student’s psyche before the system is used, and thus the basic problem
of HI replication in HCI has not been addressed at the inception of an
e-learning session.

MAIN ISSUES IN 
HUMAN INTERACTION 

In this section, we discuss communication prefer- 
ence, personality types, neurolinguistic program- 
ming, NLP language patterns, and subliminal text 
messaging. 

Communication Preference (CP) 

Each person has a preference in the way he or she communicates with
others; people also have preferences in the way they learn or pass on
information to someone else: This is called communication preference.
Learning is introduced by one of the five senses (touch, sight, taste,
hearing, and smell) and initially passes into the subconscious sensory
memory, from sensory memory to short-term memory, and then, usually
via rehearsal, to long-term memory. All input into short-term memory
is filtered, interpreted, and assessed against previous input,
beliefs, and concepts using perceptual constancy, perceptual
organization, perceptual selectivity, and perceptual readiness. Cue
recognition allows memory to pick out the key points that link to
further memory recall and to practice a skill using cognitive,
psychomotor, and perceptual skills (Cotton, 1995).

Stored instances (single items of memory) do not 
necessarily represent actuality due to the fact that 
they have already been distorted by the subject’s 
own interpretation of the facts as perceived by their 
“inner voice, eye, ear, nose, and taste.” Initially, 
instances are stored in short-term memory where 
the first and last inputs of a stream of instances are 
easier to recall: These can then be transferred to 
long-term memory by rehearsal. Different individu- 
als use their preferred inner sense to aid perception. 
For learning to be effective, new instances are 
associated with existing instances. The use of the 
working memory constantly improves and refines 
long-term memory; indeed, practical “day-dream- 
ing” is a form of forward planning that can improve 
retention (Cotton, 1995). 

Iconic sensory input is the most important for remembering and
learning. Cotton (1995) showed that using an A4 sheet divided into
sections with questions and answers aided storing; linking this with
sound further increased retention in long-term memory. Research shows
a link between good recall and good recognition, and that memory is
seldom completely lost: It only requires specific cues to connect
linked instances and bring them to the level of conscious thought.
Here associations, self-testing, sectional learning, rhyme rules,
mnemonics, spelling associations, memory systems, and networks of
memory patterns (both linear and lateral) are used to develop cue
recognition (Catania, 1992; Cotton, 1995).

Borchert, Jensen, and Yates (1999) report that visually (V) oriented
students prefer to receive information via their eyes, in charts,
graphs, flow charts, and symbolic representation; aurally (A) oriented
students prefer hearing information; and kinaesthetically (K) oriented
students prefer “learning by doing,” through either simulated or
real-world experience and practice.

Personality Types 

Different personality types require different communication treatment.
Extroverts (E) and introverts (I) can be viewed as one couplet, slow
(S) and quick (Q) decision makers as another couplet. All personality
types can be plotted somewhere on the resultant EI/SQ scale.

• ES types prefer the company of others, are slow to make decisions,
take their time, and sometimes will not make a final commitment. They
can be categorized as “arty.”

• IS types prefer their own company, are slow to make decisions, take
their time, and sometimes will not make a final commitment; they are
very precise. They are often categorized as “analytical.”

• EQ types prefer the company of others, are fast to make decisions,
and often make a commitment before thinking it through. They are often
categorized as “salesmen.”

• IQ types prefer their own company, are fast to make decisions, and
often make a commitment before thinking it through. They are often
categorized as “businessmen.” (Fuller, Norby, Pearce, & Strand, 2000;
Janvier & Ghaoui, 2003b; Myers & Myers, 1995)
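The four quadrants of the EI/SQ scale listed above can be written down as a small lookup. This is only a sketch: the numeric scores in the range -1 to 1, the zero threshold, and the function name personality_type are assumptions for illustration and are not part of the instruments cited.

```python
# Informal labels for each quadrant, taken from the list above.
LABELS = {
    ("E", "S"): "arty",
    ("I", "S"): "analytical",
    ("E", "Q"): "salesman",
    ("I", "Q"): "businessman",
}

def personality_type(extroversion, decision_speed):
    """Map two scores in [-1, 1] to an EI/SQ quadrant and its label.

    Assumption: a non-negative extroversion score means extrovert (E),
    and a non-negative decision_speed score means quick decision maker (Q).
    """
    ei = "E" if extroversion >= 0 else "I"
    sq = "Q" if decision_speed >= 0 else "S"
    return ei + sq, LABELS[(ei, sq)]
```

For example, personality_type(0.4, -0.7) places a sociable but slow decision maker in the ES quadrant.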

Research has shown that when a student joins higher education, his or
her primary personality type, and thus learning style, has been
established (Wilson, Dugan, & Buckle, 2002). By this time, introverts
have usually learned to use their auxiliary or, maybe, even their
tertiary Jungian Function and, thus, tend to hide their true primary
personality type: The unwary tutor can use inappropriate communication
techniques, with resulting frustration (Janvier & Ghaoui, 2003b;
Wilson et al., 2002).

Neurolinguistic Programming 

The name neurolinguistic programming (NLP) comes from the disciplines
that influenced the early development of its field, beginning as an
exploration of the relationship between neurology, linguistics, and
observable patterns (programs) of behaviour. Combining these
disciplines, NLP can be defined as:

The reception, via our nervous system, of instances received and
processed by the five senses (iconic, echoic, haptic, gustatory, and
olfactory), the resultant use of language and nonverbal communication
systems through which neural representations are coded, ordered, and
given meaning, using our ability to organize our communication and
neurological systems to achieve specific desired goals and results.

Or more succinctly, “The Study of the Structure of Subjective
Experience and what can be calculated from it” (Janvier & Ghaoui,
2002b; Pasztor, 1998b; Sadowski & Stanney, 1999; Slater, Usoh, &
Steed, 1994).

John Grinder, a professor at UC Santa Cruz, and Richard Bandler, a
graduate student, developed NLP in the mid-70s. They were interested
in how people influence one another, in the possibility of being able
to duplicate the behavior, and thus the way people could be
influenced. They carried out their early research at the University of
California at Santa Cruz, where they incorporated technology from
linguistics and information science, and knowledge from behavioural
psychology and general systems theory, in developing their theories on
effective communication. As most people use the term today, NLP is a
set of models of how communication impacts and is impacted by
subjective experience. It’s more a collection of tools than any
overarching theory. Much of early NLP was based on the work of
Virginia Satir, a family therapist; Fritz Perls, founder of Gestalt
therapy; Gregory Bateson, anthropologist; and Milton Erickson,
hypnotist. - Stever Robbins, NLP Trainer (Bandler & Grinder, 1981).

NLP Language Patterns 

Craft (2001) explores relationships between NLP 
and established learning theory and draws a distinc- 
tion between models, strategies, and theories. Craft 
argues that, while NLP has begun to make an impact 
in education, it still remains a set of strategies rather 
than a theory or model. NLP research has shown 
that this set of strategies results in increased memory 
retention and recall, for example: 

Pasztor (1998a) quotes the example of a student 
with a visual NLP style whose tutorial learning 
strategy was based on “listen, self-talk” and sport- 
learning strategy was “listen, picture, self-talk.” The 
former did not achieve memory store/recall while 
the latter did. She also reports that rapport with a 
partner is the key to effective communication and 
that incorporating NLP in intelligent agents will 
allow customization of the personal assistant to the 
particular habits and interests of the user thus mak- 
ing the user more comfortable with the system. 

Introducing the correct sub-modality (visual, auditory, kinaesthetic)
will enable the subject to more easily store and recall instances
in/from memory. It is argued that inviting a subject to “see” invokes
iconic, to “hear” invokes auditory, and to “feel” invokes kinaesthetic
recall (Pasztor, 1997).

Subliminal Text Messaging 

A subliminal text message is one that is below the threshold of
conscious perception and relates to iconic memory (the persisting
effects of visual stimuli). After-effects of visual stimuli are called
icons. Iconic memory deals with their time courses after the event.
Subliminal images and text (instance input that the conscious mind
does not observe but the subconscious does) can have a powerful effect
on memory and cognitive memory. “Unconscious words are pouring into
awareness where conscious thought is experienced, which could from
then on be spoken [the lips] and/or written down” (Gustavsson, 1994).
The time course of iconic memory is measured over fractions of
seconds, but, in this time, the subject retains images that are no
longer there (e.g., the imposition of fast-changing still images on
the retina creates the effect of motion).

Johnson and Jones (1999) affirm, “participants in text based chats
rooms enjoy their anonymity and their ability to control the
interaction precisely by careful use of text” and that the nature of
such interactions may well change dramatically if the participants
could observe each other and their mind state (Pasztor, 1997, 1998b;
Sadowski & Stanney, 1999).

AN INTUITIVE INTERACTIVE 
MODEL TO SUPPORT E-LEARNING 

In this section, we discuss a model called WISDeM 
and its evaluation, the results, and the statistical 
comparison of these results. 

WISDeM 

WISDeM (Web intuitive/interactive student e-learning model) has been
further developed into an intuitive tool to support e-learning; it
combines these HI factors: CP, personality types, learning styles
(LS), NLP language patterns, motivational factors, the novice/expert
factor, and subliminal text messaging.

LS research has shown that there are more 
facets to learning than a simple communication 
preference; Keefe (1987) defines LS as, 

...the set of cognitive, emotional, characteristic 
and physiological factors that serve as relatively 
stable indicators of how a learner perceives, 
interacts with, and responds to the learning 
environment... 

Knowing your LS and matching it with the cor- 
rect teaching strategies can result in more effective 
learning and greater academic achievement (Hoover 
& Connor, 2001). 

Fuller et al. (2000) in their research posed the question, “Does
personality type and preferred teaching style influence the comfort
level for providing online instruction?”; their answer was “Yes.” They
outlined the teaching style preferences for the Myers-Briggs Type
Indicator® (MBTI®) dimensions (Extroversion/Introversion,
Sensing/iNtuition, Thinking/Feeling, Judgement/Perception) and
provided some suggestions for faculty development for seven of the
sixteen MBTI® types (ESTJ, ESTP, ESFJ, ESFP, ENTJ, ENTP, ENFJ, ISTJ,
ISTP, ISFJ, ISFP, INTJ, INTP, INFJ, INTJ).

Montgomery and Groat (1998) point out that “matching teaching styles
to LS is not a panacea that solves all classroom conflicts,” and that
other factors, such as the student’s motivation, pre-conceptions,
multicultural issues, and so forth, also impinge on the student’s
quality of learning; nonetheless, understanding and reacting to LS in
teaching enhances the quality of learning and rewards teaching.

Initially, WISDeM uses two psychometric questionnaires based on
researched concepts and principles covering VARK (Fleming, 2001),
Jungian Functions (Murphy, Newman, Jolosky, & Swank, 2002; Wilson et
al., 2002), and MBTI® (Larkin-Hein & Budny, 2000; Murphy et al., 2002;
Myers & Myers, 1995).

It creates the initial student profile “before” the 
student accesses module learning material to enable 
effective HCI interaction. After the initial login, the 
student completes the CP questionnaire from which 
the system establishes if he or she is visual, auditory 
or kinaesthetic. A relevant report is output and the 
student opens the LS questionnaire. The questions in 
this are couched using text that matches his or her 
CP. Upon completion of this questionnaire, an LS 
report is produced, and, provided the student agrees, 
the initial student profile is saved in the CPLS 
database. 

As the student uses the system, his or her CPLS, together with a
novice/expert (NE) factor, is used to create and update a unique
student model (SM). The system’s algorithms use this SM and the
student’s updated knowledge state to retrieve and display relevant
messages and information in the interface. The NE factor is
dynamically moderated as a student moves through a topic and reverts
to the default value when a new topic is started.
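A student model of this kind might be sketched as follows. The sketch is purely illustrative and is not the WISDeM implementation: the class StudentModel, the numeric NE scale from 0.0 (novice) to 1.0 (expert), the step size, and the cue words chosen for each communication preference are all assumptions.

```python
from dataclasses import dataclass

DEFAULT_NE = 0.0  # assumed default: 0.0 = novice, 1.0 = expert

# Assumed cue verbs matching each communication preference (CP).
CUES = {"visual": "see", "auditory": "hear", "kinaesthetic": "feel"}

@dataclass
class StudentModel:
    cp: str                 # "visual", "auditory", or "kinaesthetic"
    ls: str                 # learning style label from the LS questionnaire
    ne: float = DEFAULT_NE  # novice/expert factor for the current topic

    def record_answer(self, correct, step=0.25):
        """Moderate the NE factor as the student works through a topic."""
        delta = step if correct else -step
        self.ne = min(1.0, max(0.0, self.ne + delta))

    def start_topic(self):
        """Revert the NE factor to its default when a new topic starts."""
        self.ne = DEFAULT_NE

    def header_message(self, topic):
        """Pick a header whose wording matches the student's CP and level."""
        level = "summary" if self.ne >= 0.5 else "step-by-step guide"
        return f"{CUES[self.cp].capitalize()} the {level} for {topic}."
```

A visual student who answers two questions correctly would, under these assumptions, move from “See the step-by-step guide for loops.” to “See the summary for loops.”, and back to the default on the next topic.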

The tool allows a student to select any topic for revision. In topic
revision, the student can select either “LEARN” or “TEST” knowledge
for any one lecture or a series of lectures as the module develops.

The system’s use of repetitive header messages invokes subliminal text
messaging: The student skips over previously noticed information; his
or her “I’ve seen this before,” or “Have I seen something like it
before?” filter kicks in, leading to conscious or subconscious
rejection (Catania, 1992; Gustavsson, 1994). The unconscious
observation of the NLP language patterns matching his or her CP is
effective: his or her eyes scan the page and take in the displayed
information at the subliminal level, while he or she consciously
notices what he or she wants to see.

Evaluation 

The evaluation was a systematic and objective examination of the
completed project. It aimed to answer specific questions, to judge the
overall value of the system, and to provide information to test the
hypothesis, “Matching neurolinguistic language patterns in an online
learning tool, with the learner’s communication preference and
learning styles, will provide an intuitive tutoring system that will
enhance Human-Computer Interaction and communication and, thus,
enhance the storing and recall of instances to and from the learner’s
memory; thus enhancing learning,” that is, to establish whether the
results support the hypothesis (H1) or not (the null hypothesis, H0).

Statistically, the null hypothesis (H0: p = 0) states that
there is no effect or change in the population and that the
sample results were obtained by chance, whereas the
alternative hypothesis (H1: p > 0) affirms that the
hypothesis is true (H = hypothesis, p = population).
P-values were used to test the hypothesis, where the null
hypothesis H0 is either accepted or rejected. The p-value
represents the probability of making a Type I error, which
is rejecting H0 when it is true. The smaller the p-value,
the smaller the probability that a mistake is being made by
rejecting H0. A cut-off value of 0.05 is used: values
>= 0.05 mean that H0 should be accepted; values < 0.05
suggest that the alternative hypothesis (H1) needs
consideration and that further investigation is warranted;
values <= 0.01 are a strong indication that H1 is valid.
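
The decision rule above can be written down as a small function. This is only a sketch of the three bands described in the text; the function name and the wording of the returned labels are assumptions made for illustration, not part of the evaluated system.

```python
def interpret_p_value(p: float) -> str:
    """Map a p-value to the three bands described in the text.

    The labels are illustrative wording, not taken from the article.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p <= 0.01:
        return "strong indication for the alternative hypothesis"
    if p < 0.05:
        return "alternative hypothesis needs consideration"
    return "accept the null hypothesis"


if __name__ == "__main__":
    # The two p-values reported later in this evaluation.
    for p in (0.036, 0.005):
        print(p, "->", interpret_p_value(p))
```

Checking the narrowest band first keeps the comparisons simple, since every value at or below 0.01 is also below 0.05.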

Figure 2. Interactive group evaluation flow chart

To ensure the maximum integrity in sampling, simple random
sampling (sampling without replacement) was used: “Simple
random sampling is the sampling design in which n distinct
units are selected from N units in the population in such a
way that every possible combination of the n units is
equally likely to be the sample selected” (Thompson, 1992).
This gave every student the same probability of
participating, without any preference, and, therefore, the
sample was more likely to reflect the whole than if any
other sampling method had been used (Clarke & Cooke, 1978;
Thompson, 1992; Yates, Moore, & Starnes, 2002).
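
Simple random sampling without replacement, as defined in the quotation above, can be sketched with the Python standard library. The roster and the sample size below are hypothetical; the article does not say how the selection was carried out in practice.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every n-subset is equally likely."""
    units = list(population)
    if n > len(units):
        raise ValueError("cannot sample more units than the population holds")
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(units, n)  # random.sample never repeats a unit


if __name__ == "__main__":
    # Hypothetical class roster, used only to show the call.
    roster = [f"student_{i:02d}" for i in range(1, 61)]
    print(simple_random_sample(roster, 10, seed=2024))
```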

The evaluation required two tests, control and interactive;
thus, the statistical analysis was required to compare two
means. The importance of the sample selection is paramount:
Stronger results are obtained where each group is matched.
Thus, the subjects were selected from students in one
particular module; this ensured that there was a match in
year, course, and age group (20s), but not sex (the random
selection of students reflected the class spread of 84.13%
male and 15.87% female). Another factor considered was the
fact that the sample sizes would be unequal: The test was
run in two sections requiring circa two hours in total, and
risk factor analysis suggested that some students would
complete only one part of the evaluation. Thus, a
Two-Sample T-test was used.
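
A two-sample t-test with a pooled standard deviation, which tolerates unequal group sizes, can be sketched as follows. The article reports only summary figures, so the marks below are hypothetical, and the function name is an assumption. The sketch stops at the t statistic and the degrees of freedom; turning these into a p-value needs the t distribution, which is left to a statistics package.

```python
import math
from statistics import mean, variance


def pooled_two_sample_t(a, b):
    """Return (t statistic, degrees of freedom, pooled standard deviation)."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = n1 + n2 - 2
    # Weight each sample variance by its own degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    pooled_sd = math.sqrt(pooled_var)
    if pooled_sd == 0:
        raise ValueError("both groups are constant; t is undefined")
    t = (mean(a) - mean(b)) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, df, pooled_sd


if __name__ == "__main__":
    interactive = [62, 71, 55, 80, 68]  # hypothetical marks
    control = [58, 49, 66, 61]          # hypothetical marks, smaller group
    print(pooled_two_sample_t(interactive, control))
```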

CPLS Evaluation Results 

The CPLS evaluation had 97 responders (86 male,
11 female). Their communication preferences (%)
were:

Visual 67.01 Auditory 27.84 Kinaesthetic 5.15 



Communication Preference is reported gener- 
ally as: 

V = 60% A = 30% K = 10%. 

The evaluation group’s averages were: 

V = 67.01% A = 27.84% K = 5.15%. 

These compare quite well with previously re- 
ported research (Brown, 2001; Catania, 1992; Cot- 
ton, 1995; Janvier & Ghaoui, 2003b). As the group 
shows a stronger tendency to visualization, this 
should reflect in memory retention data being stron- 
ger for this group than for the average due to the fact 
that the lecture content is based mainly on a visual 
presentation and auditory delivery styles. Hence, 
comparative results in the future could well be 
skewed where the group balance was more kinaes- 
thetic. 
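
The group percentages reported above are consistent with 65 visual, 27 auditory, and 5 kinaesthetic responders out of 97. These counts are not stated in the article; they are an inference from the percentages and are used here only to show the arithmetic.

```python
# Assumed counts, inferred from the reported percentages (not given in the text).
counts = {"Visual": 65, "Auditory": 27, "Kinaesthetic": 5}

total = sum(counts.values())  # 97 responders
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share:.2f}%")
```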

Completion time varied from 10 minutes to 30 minutes, with
the majority being very close to the group average of 15
minutes. The figures squared well with the fact that
decision style affects the speed of completion of a task:
Judgemental types tend to complete tasks faster than
Perceptive types, who take longer with a task, being more
curious than decisive, and have a tendency to lose interest
and not complete the task (see p. 76 in Myers & Myers,
1995).

The totals of each type containing the J-type
[ESFJ-ESTJ-ENFJ-ENTJ-ISFJ-ISTJ-INFJ-INTJ]
as compared with the P-type
[ESFP-ESTP-ENFP-ENTP-ISFP-ISTP-INFP-INTP] were:

Judgement 71.13% Perceptual 28.87%

Personality Types provided a split between ex- 
troverts (59.79%) and introverts (40.21%). 

Each type was rated from 0 to 5. The average ratings for
the dominant type, which could range from 3 to 5, were:



Interpersonal Communication   E 3.55   I 3.49
Information Processing        S 3.97   N 3.23
Information Evaluation        T 3.49   F 3.62
Decision Style                J 4.38   P 3.48



Evaluation Results 



The interactive group (IG) completed the interactive topic
learning, then the interactive topic testing, and then the
control topic testing sections, whereas the control group
(CG) completed the control topic learning, then the control
topic testing, and then the interactive topic testing
sections. This ensured that a set of comparative results
was available. Because the students were completing topic
testing twice, it was anticipated that there would be an
improvement in marks. There was, for both groups, with the
IG gaining more than the CG.

The intuitive section had 50 students log into the
system, of whom 33 answered questions:

• 33 completed the Interactive Group [IG] Multi- 
choice Q&A, 

• 27 completed the Control Group [CG] Multi- 
choice Q&A, 

• 27 completed both types. 

The average time taken for the evaluation/exer- 
cise was 94 minutes: varying from 50 min. to 140 min. 

Comparing the Marks for Both Sets of 
Students 

The Two-Sample T-test for Interactive and Control Student
Marks used a confidence level of 95% with a pooled StDev of
19.8. It produced a P-Value of 0.036. This P-Value indicates
that H1 is valid; however, because the P-Value is not below
0.01, the degree of probability requires more sampling to
harden: More research results need to be gathered and
assessed before this section of the results analysis can be
viewed as proof of the hypothesis. At this time, the results
provide a strong indication that the hypothesis is true.

Comparing the Gains Made 
by Students 

The Two-Sample T-test for Interactive and Control Students’
Gain used a confidence level of 95% and a pooled StDev of
7.50. The results gave a P-Value of 0.005. This P-Value
indicates that H1 is valid; in particular, it is well below
0.01. This indicates that the degree of probability is very
strong, demonstrating probable improvement in memory
retention and recall: The results of the analysis provide a
strong indication that the hypothesis is true.

CONCLUSION 

The evaluation results indicated that the model imple- 
mented: 

• Is likely to make a significant improvement to 
student learning and remembering. 

• Produced more rehearsal from students than 
the control system and improved their marks. 

• Supported a general belief in the system, that it 
did indeed assist knowledge retention. This in 
itself is an important factor for the students’ 
psyche. 

• Held interest longer than the control system and
was more capable of interacting at the student’s
own level.

The evaluation indicates that the model does in fact aid
memory retention and recall and, thus, remembering and
learning, and that the use of NLP language patterns can
affect the way students recall instances. It also indicates
that establishing CP and LS “before” the student starts
using an e-learning system, and using them in HCI, is an
important message to take forward.

FUTURE TRENDS

While communication preference, the assessment of
personality types, and the conscious observance of body
language, together with reacting to these using
neurolinguistic programming language patterns, have been
used very effectively since the mid-1970s and have enhanced
human communication and learning (remembering) in the sales
industry and management training (Bandler & Grinder, 1981),
there has not been a reciprocal development in
human-computer interaction. Gustavsson (1994) and Johnson
and Jones (1999) have indicated that subliminal messaging is
used effectively in the advertising industry; once again,
there has been no such reciprocal development in
human-computer interaction. At this time, much research is
looking at improving HCI with the use of avatars (e.g.,
ADELE (Ganeshan, Johnson, Shaw, & Wood, 2000; Shaw,
Ganeshan, Johnson, & Millar, 2000)). Future development in
HCI is likely to slowly encompass HI; however, the basic
tenets of HI, as demonstrated here, need to be introduced
sooner rather than later.

KEY TERMS

Communication Preference: The selection of 
your own way in the art and technique of using 
words effectively to impart information or ideas. 

E-Learning (Online Learning): Learning us- 
ing electronic media. 

Human-Computer Interaction (HCI): The study of how humans
interact with computers, and how to design computer systems
that are usable, easy, quick, and productive for humans to
use.

Learning Styles: The sixteen styles made up from four
couplet types: Extrovert/Introvert, Sensing/iNtuition,
Thinking/Feeling, and Perception/Judgement.



Modality: 

• Auditory: Use of auditory imagery: hearing, 
tonality, pitch, melody, volume, and tempo. 

• Kinaesthetic: Use of emotional, feeling, move- 
ment imagery: intensity, temperature. 

• Visual: Use of visual imagery: sight, colour, 
brightness, contrast, focus, size, location, and 
movement. 

Neurolinguistic Programming (NLP): The study of the
structure of subjective experience and what can be
calculated from it.

Neurolinguistic Programming Language 
Patterns: The use of the words, or similar con- 
structs, “See” for iconic, “Hear” for auditory, and 
“Feel” for kinaesthetic subjects both in language and 
text at the relevant times. 

INTRODUCTION

The introduction of computers is recreating a new 
criterion of differentiation between those who be- 
come integrated as a matter of course in the techno- 
cratic trend deriving from the daily use of these 
machines and those who become isolated by not 
using them. This difference increases when com- 
puter science and communications merge to intro- 
duce virtual education areas, where the conjunction 
of teacher and student in the space-time dimension 
is no longer an essential requirement and where the 
written text becomes replaced (or rather comple- 
mented) by the digital text (Garcia & Garcia, 2005). 

In order to rescue those educators who have much to offer
in an educational system, whether virtual or face-to-face,
as authors of teaching resources, suitable authoring tools
should be designed with more thought given to the
pedagogical process than to the technical aspects.

Hypertext Composer, or simply HyCo, is one of these
authoring tools. It presents a pedagogical interaction model
that makes the creation of educational resources easier for
every teacher/author, independently of his or her level of
computer expertise. HyCo is both an authoring tool and a
retrieval tool, in that it encapsulates all the complexity
of handling current tools within the facilities that the
author needs and offers, as a result, a hypermedia teaching
product that can be distributed in different formats for the
user’s access.

HyCo has an important semantic basis that brings this tool
close to the Semantic Web concept (Berners-Lee, Hendler &
Lassila, 2001) and allows the creation of Semantic Learning
Objects (SLO) that could be imported into more specialized
Learning Management Systems (LMS). In order to achieve the
semantic definition of the created educational resources,
HyCo uses Learning Technology Standards or Specifications
(LTS), seeking to obtain contents that are able to work in
other systems (interoperability), to follow up information
about learners and contents (manageability), to be usable in
other contexts (reusability), and to avoid obsolescence
(durability).

This article is devoted to introducing HyCo as an
authoring/retrieval tool for SLOs, which presents an
interaction model that hides all the technical complexity
from the authors but, at the same time, offers all the power
of semantic definitions in order to publish or use the
contents in advanced e-learning environments. The rest of
the article is organized as follows: the Background section
establishes the background of the presented topic, making a
comparison with related works; the HyCo Authoring Tool
section presents the HyCo authoring tool; finally, the
sections Future Trends and Conclusion provide the future
trends and the concluding remarks of the article,
respectively.

BACKGROUND 

There are many different hypermedia authoring tools that
could be used to produce hypermedia systems for the
education domain. Some of them are commercial, whereas many
others have been developed for educational and research
goals. HyCo has no commercial ambitions for now, and we
decided to develop our own solution in order to achieve our
research goals, which include semantic, adaptive, and
collaborative issues; some of them are present in the
current version, some are in working prototypes, and others
will appear in future versions.

First, it is important to say that HyCo inherits properties
from the two main trends in hypermedia systems: closed and
open hypermedia systems. The former store both content and
hypermedia structures internally (monolithic systems) or in
a database. External applications or information cannot
participate easily or be included in the hypertext system.
These systems produce self-contained hypermedia systems, but
they do not support heterogeneity and, in particular, do not
support hypertext distributed over multiple heterogeneous
managers. Open hypermedia systems, in contrast, have the
ability to integrate distributed information and to store
their content outside the hyperbase, in particular keeping
linking information separate from documents and allowing for
more powerful link structures.

HyCo presents a reader mode, in which the hypertext can be
navigated within the tool in a self-contained way, as in
classic authoring systems such as IRIS Intermedia
(Yankelovich, Haan, Meyrowitz, & Drucker, 1988) or
Storyspace (Bernstein, 1991, 2002). These two systems are
significant representatives of the so-called closed
hypermedia systems, which store both content and hypermedia
structures internally (monolithic systems) or in a database.
In addition, HyCo has voice synthesis capabilities in order
to make the developed hypertext system more accessible. The
differentiation of the author and reader roles in the same
authoring tool differs from other systems, such as MS
FrontPage (http://www.microsoft.com/frontpage), which
present only authoring capabilities.

Regarding the use of external vs. internal links, HyCo
follows a compromise between these approaches by storing
links internally but representing them externally. Links are
stored inside the educational resource; in this way, users
do not have separate link files that could cause errors when
opening the resource. But HyCo links are represented
separately and compactly rather than being spread implicitly
throughout the system. This idea is based on the link system
of Storyspace v2 (Bernstein, 2002) and Chimera (Anderson,
Taylor, & Whitehead, 2000) instead of the embedded link
model of HTML.
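
One way to picture this compromise is a resource that carries its own link table alongside the node contents, so that there is no separate link file, while the node text itself contains no embedded anchors. The sketch below is purely illustrative: the class and field names are hypothetical and are not taken from HyCo.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Link:
    source: str  # id of the node the link starts from
    target: str  # id of another node, or an external document
    kind: str = "reference"


@dataclass
class Resource:
    """One educational resource: node contents plus a compact link table."""
    nodes: Dict[str, str] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_link(self, source: str, target: str, kind: str = "reference") -> Link:
        if source not in self.nodes:
            raise KeyError(f"unknown source node: {source}")
        link = Link(source, target, kind)
        self.links.append(link)  # stored inside the resource, not in the text
        return link

    def links_from(self, node_id: str) -> List[Link]:
        return [link for link in self.links if link.source == node_id]


if __name__ == "__main__":
    book = Resource(nodes={"intro": "Welcome.", "ch1": "First chapter."})
    book.add_link("intro", "ch1")
    print(book.links_from("intro"))
```

Because the links live in one table, they can be listed, searched, or edited without parsing the node contents, which is the contrast with embedded links drawn in the paragraph above.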



Regarding the semantic characteristics, a similar proposal
can be found in HYLOS (Hypermedia Learning Object System)
(Feustel & Schmidt, 2001). This system is devoted to
creating ELearning Objects (ELOs) instead of HTML pages. In
this case, the contents are completed with their metadata to
compose an ELO. The metadata used are a subset of the LOM
(Learning Object Metadata) (IEEE, 2003) instead of the IMS
Metadata (IMS, 2003c) used in HyCo.






HyCo AUTHORING TOOL 

HyCo is a powerful authoring tool for educational 
purposes, which means that an author can create 
hypermedia educational resources with it. But the 
same tool also could be used to access created 
contents in a read-only mode by a student or reader. 

HyCo is a multiplatform tool; it does not force the use of
one concrete platform. The idea is that if we want teachers
to use it, they should work in the context in which they
feel comfortable. The current version of HyCo works on a
wide range of operating systems; for this reason, Java 2
Standard Edition technology (Sun, 2004) was chosen as the
development base.

The main goal of the authoring tool is the creation of
educational contents while trying to achieve independence
from the final publication format. There is a clear
separation between the contents and their presentation. In
this way, the educator writes the contents once and reuses
them every time he or she needs them. In order to achieve
this goal, the HyCo tool uses an internal XML-based format
(Bray et al., 2004) to store the educational contents of the
produced electronic books. It is precisely the HyCo
XML-based format that allows the introduction of LTSs in
this authoring tool; specifically, HyCo supports IMS
specifications (IMS, 2003a, 2003b, 2003c) and EML
(Educational Modeling Language) (Koper, 2001).

Separating the content from the presentation offers authors
a way to generate a result that is independent of the
authoring tool. In this way, HyCo has an output gallery that
supports HTML, PDF, TXT, RTF, SVG, and PS output formats.
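
The idea of writing content once and publishing it in several formats can be sketched with a tiny XML document and two renderers. The element names (`resource`, `section`, `para`) are invented for this illustration and do not reflect HyCo’s internal format, which the article does not describe.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content file; the element names are assumptions.
SOURCE = """<resource title="Computer history">
  <section title="Early machines">
    <para>Mechanical calculators came first.</para>
  </section>
</resource>"""


def to_txt(root):
    """Render the content as plain text."""
    lines = [root.get("title", "")]
    for section in root.iter("section"):
        lines.append("")
        lines.append(section.get("title", ""))
        lines.extend(p.text or "" for p in section.findall("para"))
    return "\n".join(lines)


def to_html(root):
    """Render the same content as an HTML fragment."""
    parts = [f"<h1>{escape(root.get('title', ''))}</h1>"]
    for section in root.iter("section"):
        parts.append(f"<h2>{escape(section.get('title', ''))}</h2>")
        parts.extend(f"<p>{escape(p.text or '')}</p>" for p in section.findall("para"))
    return "\n".join(parts)


if __name__ == "__main__":
    root = ET.fromstring(SOURCE)
    print(to_txt(root))
    print(to_html(root))
```

Adding another output format means adding another renderer; the stored content does not change.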

HyCo’s user interface has two main facilities that improve
its usability. First, this authoring tool has an
internationalized interface that currently supports two
languages, Spanish and English. Second, its interface allows
voice synthesis, which permits the generation of an
artificial voice that reads the contents to the user. This
capability is very useful for making presentations and also
for facilitating access to the educational contents for
people with disabilities (e.g., blind people).

More precisely, the HyCo authoring tool comprises three
main components: an editor for linear educational resources,
which provides an indexed tree structure; an editor for
composite semantic learning objects, which works by
inserting values for the appropriate metadata elements; and
retrieval and management facilities for the multimedia
information of the learning resources.

Editor for Educational Resources 

This editor is based on the content index metaphor 
that reproduces a hierarchical structure that guides 
us in our creative process. The following step is to 
associate contents with each index entry, an index 
that may vary as the contents take shape, by insert- 
ing, eliminating, or changing entries. Each index entry 
gives rise to a thematic unit, or lexia, that can contain 
text, multimedia material, and links with other units or 
documents. 

This indexed or tree structure facilitates the 
authoring of the hypertext, but having only an index 
as a navigation tool is not acceptable in order to 
create real pedagogical hypermedia resources where 
the student may construct his or her own knowledge. 
The hyperdocuments should be designed in such a 
way to encourage the readers to see the same text in 
as many useful contexts as possible. This means 
placing texts within the contexts of other texts, 
including different views of the same text (Jones & 
Spiro, 1992). 

For this reason, HyCo allows links to be associated with
the multimedia elements that compose an index entry (i.e., a
hypertext node). In this way, the hypertext can be followed
by its index structure, but when a node is selected, the
reader may choose to navigate by an existing link. Thus,
HyCo documents combine both content index and Web-like
structures.

The content index metaphor is supported directly in the
user interface, which is frame-structured. The left part of
the screen shows links to every part of the hypertext
structure, and the main frame is the writing/displaying
area. The Web metaphor is supported by two buttons that
allow creating or modifying the links. The main interface is
completed with a toolbox area, which allows inserting,
erasing, or renaming the entries of the structure, and with
an information area at the lower-right corner, where the
characteristics of the selected link (e.g., type of link,
name, description) are displayed, as shown in Figure 1.

Editor for Composite Semantic 
Learning Objects 

An SLO is a learning resource that is wrapped with 
a set of metadata and can be used in the instructional 
design process. In HyCo, every SLO should be 
compliant with IMS Metadata (IMS, 2003c). Then, 
every section of every educational resource or e- 
book created in HyCo can be converted to an SLO. 

To do this, HyCo executes a two-step process, where the
first step is an automatic process and the second step is a
manual process. In the automatic process, HyCo sets all the
IMS metadata elements that can be inferred from other data
or that are liable to have default values.

Figure 1. HyCo main interface

Once this process is over, HyCo executes the 
manual process, where it presents to the user the 
elements that cannot be generated automatically 
and/or that require reexamination, modification, or 
addition (see Figure 2). 

When the two-step process is finished, an XML 
file is generated for each new SLO (each one of 
them corresponds to each educational resource, 
section, or subsection) and stored in an IMS metadata 
SLO repository. 
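
The two-step process described above can be sketched as follows: an automatic step fills what can be inferred or defaulted, a manual step supplies the rest, and the complete record is serialized to XML. The field names and default values are simplified placeholders chosen for this illustration; they are not the IMS Metadata binding and not HyCo’s actual code.

```python
import xml.etree.ElementTree as ET


def automatic_step(section):
    """Fill the fields that can be inferred or defaulted; leave the rest open."""
    return {
        "title": section["title"],                   # inferred from the section
        "language": section.get("language", "en"),   # assumed default value
        "format": "text/html",                       # assumed default value
        "description": None,                         # must come from the author
        "keywords": None,                            # must come from the author
    }


def pending(metadata):
    """List the fields the author still has to provide."""
    return [name for name, value in metadata.items() if value is None]


def manual_step(metadata, answers):
    """Merge the author's answers; refuse to continue while fields are open."""
    merged = {**metadata, **answers}
    missing = pending(merged)
    if missing:
        raise ValueError(f"still missing: {missing}")
    return merged


def to_xml(metadata):
    root = ET.Element("metadata")
    for name, value in metadata.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    draft = automatic_step({"title": "Heart valves"})
    print("ask the author for:", pending(draft))
    final = manual_step(draft, {"description": "Overview", "keywords": "surgery"})
    print(to_xml(final))
```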

Retrieval and Management Facilities for 
Multimedia Information 

All the elements that the user has to manage and link to
the text are organized in information repositories called
galleries. These galleries present very intuitive interfaces
to manage the concrete elements (images or videos, for
example). They have thumbnails and descriptions of the
elements but also offer simple search engines that allow the
user to find the right element in the collection. The
gallery metaphor was initially conceived to manage the
multimedia elements, but the success of this metaphor among
users of HyCo has prompted the extension of this concept in
order to manage other properties of the document,
specifically the styles and the output formats. As an
example of the gallery metaphor, Figure 3 shows the sound
gallery interface.






FUTURE TRENDS 

The educational hypermedia systems have evolved clearly and
unstoppably toward the Web. The authoring tools needed for
creating them are now more mature; the notions of
reusability and interoperability of semantic learning
objects are present in the most important e-learning
management systems, but there are many weak points, too. The
definition of educational standards, such as IMS, is an
important advancement for the real reusability and
interoperability of learning components, but these standards
should be more present in the authoring process. Another
important improvement area is the pedagogical model that
authoring tools should support; it is essential that the
process of creating learning components be guided by correct
pedagogical guidelines. Adaptivity is another important key
to the success of educational hypermedia systems; in order
to achieve a personalized learning process, the author must
have adequate resources to define the rules that will guide
the individualized learning. Finally, collaborative and
cooperative authoring capabilities are of interest in all
creative processes, and the case of educational hypermedia
systems is no exception.

Figure 2. HyCo and IMS metadata

Figure 3. Sound gallery

The HyCo authoring tool presents characteristics related to
the semantic and pedagogical topics introduced previously.
Current research and development efforts are directed to the
adaptive and collaborative facilities.



CONCLUSION 

In this article, we have introduced HyCo as an authoring
tool that allows the definition of both learning resources
and learning components, or SLOs (i.e., semantic educational
resources based on XML specifications), which could be
delivered in diverse LMSs. Specifically, HyCo supports two
LTSs, EML and IMS. The first one was our first attempt in
this field, but it was superseded by IMS.



The success of the HyCo authoring process has been proved
with three educational Web-based systems. Two of them are
drafts devoted to testing the authoring tool, one about
computer history and the other about software engineering.
The third one is a complete electronic book in hypermedia
format about cardiovascular surgery that consists of 14
chapters, more than 500 sections, and over 1,000 images.
This book is successfully used in lectures on this subject
at the University of Salamanca. Recently, we conducted an
SUS (System Usability Scale) (Brooke, 1996) usability test
with the HyCo authoring tool. The result obtained was 83.495
out of 100 with a population of 14 users who are not related
to the HyCo development team.
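
For readers unfamiliar with the SUS, the standard scoring of a single questionnaire can be sketched as follows: ten items answered on a 1 to 5 scale, odd-numbered items contributing the answer minus 1, even-numbered items contributing 5 minus the answer, and the sum multiplied by 2.5 to give a score out of 100. The individual responses of the 14 users are not given in the article, so the answers below are hypothetical.

```python
def sus_score(responses):
    """Score one SUS questionnaire (ten answers, each from 1 to 5) out of 100."""
    if len(responses) != 10:
        raise ValueError("the SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each answer must be an integer from 1 to 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical questionnaires; a study result is the mean over all users.
    questionnaires = [
        [5, 1, 4, 2, 5, 1, 4, 2, 5, 1],
        [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    ]
    scores = [sus_score(q) for q in questionnaires]
    print(scores, sum(scores) / len(scores))
```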

KEY TERMS

Educational Modeling Language (EML): Developed by the Open
University of the Netherlands since 1998, EML is a
notational method for e-learning environments based on a
pedagogical meta-model that considers that didactic design
plays a main role.

Hypermedia: The style of building systems for 
information representation and management around 
the network of multimedia nodes connected together 
by typed links. 

Hypermedia Authoring Tools: Authoring tools
for hypermedia systems are meant to provide envi- 
ronments where authors may create their own 
hypermedia systems in varying domains. 

Hypertext: A body of written or pictorial mate- 
rial interconnected in such a complex way that it 
could not conveniently be presented or represented 
on paper (Theodor H. Nelson). 

IMS Specifications: IMS was born in 1997 as 
a project of the National Learning Infrastructure 
Initiative at Educause. In 2000, it became a non- 
profit organization. Its mission is to promote distrib- 
uted learning environments. 

Learning Technology Standards or Specifications (LTS):
Agreements about the characteristics that a learning element
should have in order to be compatible, interchangeable, and
interoperable with other learning systems. The use of
standards ensures the interoperability of instructional
technologies and their learning objects for universities and
corporations around the globe. Examples of LTS are IMS, EML
(Educational Modeling Language), and LTSC IEEE LOM (Learning
Technology Standards Committee of the IEEE, Learning Object
Metadata).

Metadata: Information about data or other in- 
formation. 

Metaphor: An understandable mental image of real objects.
The knowledge and the relationships among elements in a
known domain are translated to a non-familiar domain.

Semantic Learning Object: A learning re- 
source that is wrapped with a set of standardized 
metadata and can be used in the instructional design 
process. 

Semantic Web: A Web that includes documents 
or portions of documents describing explicit relation- 
ships among things and containing semantic infor- 
mation intended for automatic processing by our 
machines. 

Voice Synthesis: The process that allows the 
transformation of the text to sound. 

INTRODUCTION

Sense of presence is one of the most interesting 
phenomena that enriches users’ experiences of in- 
teracting with any type of system. It allows users to 
be there (Schloerb & Sheridan, 1995) and to per- 
ceive the virtual world as another world in which 
they really exist. 

The interest in presence phenomenon is not novel 
(Gerrig, 1993), but it has grown lately due to the 
advent of virtual reality (VR) technology. The spe- 
cific characteristics of virtual environments (VEs) 
transform them into suitable experimental testbeds 
for studies in various research areas. This also 
resuscitated the interest in presence, and much work 
has focused on the development of a theoretical 
body of knowledge and on a whole set of experimen- 
tal studies aimed at understanding, explaining, mea- 
suring, or predicting presence. All of these efforts 
have been made to increase the understanding of 
how presence can be manipulated within the VEs, 
particularly within the application areas where pres- 
ence potential has been acknowledged. 

Probably one of the most important reasons 
motivating presence research is the relationship it 
holds with task performance. This debatable rela- 
tionship together with the more obvious one between 
presence and user satisfaction suggest that pres- 
ence may play an important role in the perceived 
system usability. 

Since presence may act as a catalyst for the learning
potential of VEs, it can be harnessed for the training and
transfer of skills (Mantovani & Castelnuovo, 1998; Schank,
1997). The potential of presence to increase the persuasive
power of the delivered content motivates research on the
impact of presence on e-marketing and advertising
(Grigorovici, 2003). Another promising application area for
presence research is within the realm of cognitive therapy
of phobias (Strickland et al., 1997).

The highly subjective nature of presence continues to
challenge researchers to find appropriate methodologies and
instruments for measuring it. This is reflected in the
ongoing theoretical work of conceptualizing a sense of
presence. The difficulties related to investigating presence
led to a large set of definitions and measuring tools.

The purpose of this article is to introduce the concept of
presence. The first section offers some conceptual
delimitations related to the presence construct. The second
section describes its main determinants along two dimensions
(i.e., technological factors and human factors). The third
section addresses the challenges of measuring presence and
offers an overview of the main methods, tools, and
instruments developed for assessing it. The fourth section
presents the complex relationship between presence and task
performance.

BACKGROUND 

Attempts to define presence have been numerous, 
and the lack of a unanimously accepted definition 
suggests the multi-dimensional nature of this con- 
struct and its not yet mature understanding. 

Presence has been described as a sense of being physically
present at the remote site (Schloerb & Sheridan, 1995;
Sheridan, 1992), a basic state of consciousness consisting
of the attribution of sensation to some distal stimuli
(Loomis, 1992), a suspension of disbelief experienced by
users while being in a remote world and not the physical one
(Slater & Usoh, 1993), or the perceptual illusion of
non-mediation (Lombard & Ditton, 1997). After analyzing
various presence definitions, we proposed the following one
(Sas & O’Hare, 2001, 2003):

Presence is a psychological phenomenon, through which one’s
cognitive processes are oriented toward another world,
either technologically mediated or imaginary, to such an
extent that he or she experiences mentally the state of
being (there), similar to one in the physical reality,
together with an imperceptible shifting of focus of
consciousness to the proximal stimulus located in that other
world.

Any attempt to conceptualize a construct also 
should consider its discriminant validity by contrasting
it with other close concepts in the field. To this
end, three other constructs (telepresence, immersion,
and flow) are introduced, and their relationships
with presence are outlined briefly.

The term telepresence, coined by Marvin Minsky
(1980), emphasizes mediation and denotes a sense of
being physically present at a remote world. Draper et al.
(1998) defined it as
the perception of presence within a remote environ- 
ment. This concept precedes and is closely related to 
the presence construct. Despite often being taken as 
synonyms, there is, however, a subtle difference 
between presence and telepresence, rooted in the 
proximity to the site where one perceives, acts, and 
ultimately experiences presence. 

Another distinction often mentioned in presence 
literature is that between presence and immersion. 
Immersion is usually associated with technological 
factors referring to the extent to which computer 
generated worlds are extensive (able to accommo- 
date a large set of sensory systems), surrounding 
(able to provide information from any virtual direc- 
tion), inclusive (able to shut out all information from 
the physical world), vivid (able to provide rich infor- 
mation content, resolution, and display quality), and 
matching (able to accurately reproduce the body 
movements previously tracked) (Slater et al., 1995, 
1996). In contrast, presence relates more to user 
characteristics, whose impact is unfortunately less 
explored. 

The last useful distinction is the one between 
presence and flow, defined as a state of optimal 
experience that occurs when people attempt tasks 
that challenge their skills (Csikszentmihalyi, 1990). 
Flow assumes a match between the task difficulty 
and one’s abilities, highly focused attention that 
leads to enjoyment, feeling of control, and an altered 
perception of time. From this, several distinctions 
emerge with respect to both the experience itself 
and its results. The experience in the case of flow, 
as opposed to presence, always requires intense 
concentration and focus of attention, a sense of 
control, and usually an intense and active participation
in the task, usually perceived more narrowly
through only some of its characteristics (Fontaine, 
1992). With respect to the results, since presence is 
not an optimal experience, it does not necessarily 
lead to a pleasant and fulfilling experience. However,
it is possible that during flow, someone will experi- 
ence a strong sense of presence, but the latter also 
can occur outside the flow (Heeter, 2003). 

Despite the diversity characterizing the defini- 
tions proposed for capturing the presence construct, 
there seems to be a common ground shared by 
researchers in the presence field, which refers to 
presence determinants. 

PRESENCE DETERMINANTS 

Several presence theories have been developed in 
the attempt to extend the understanding of presence. 
Draper et al. (1998) identified a first group consisting of
psychological models of presence and a second one 
consisting of technological models of presence. The 
first class of theories includes telepresence as flow 
experience developed by Csikszentmihalyi (1990), 
behavioral cybernetics theory (Smith & Smith, 1985), 
and a structured attentional resource model for 
teleoperation (Schloerb & Sheridan, 1995). The 
second class of theories groups different models, 
such as those elaborated by Sheridan (1992), Steuer 
(1992), Schloerb (1995), Zeltzer (1992), Witmer and
Singer (1998), and Slater and Usoh (1993). 

The factors affecting presence can be grouped 
into technological factors that consider the system 
and its characteristics, and human factors referring 
to users’ cognitive and personality aspects (Lombard 
& Ditton, 1997; Lessiter et al., 2000). 

Technological Factors 

A large amount of work has been carried out in the 
area of technological factors affecting presence. 
Lombard and Ditton (1997) provided a detailed 
account of this. Some of these factors are visual
display characteristics (image quality, image size,
viewing distance, visual angle, motion, color,
dimensionality, and camera techniques) and aural
presentation characteristics (frequency range, dynamic
range, signal-to-noise ratio, and high-quality audio).
As stimuli for other senses, Lombard and
Ditton (1997) referred to olfactory output, body movement,
tactile stimuli, and force feedback.

Media and user characteristics often were men- 
tioned as having a particular impact on the level of 
sense of presence experienced by the users. How- 
ever, there is little empirical research supporting this 
(Lessiter et al., 2000). 

Human Factors 

Psotka and Davison (1993) considered two categories
of factors that determine immersion: susceptibility
to immersion and quality of immersion.
The first set refers to user characteristics with an 
emphasis on cognitive aspects such as imagination, 
vivid imagery, concentration, attention, and self-con- 
trol, while the second set is concerned primarily with 
technological factors like affordances of VR, dis- 
tractions from the real world, and physiological ef- 
fects. 

Kaber, Draper, and Usher (2002) summarized 
user characteristics that seem to impact presence 
experienced within VEs. Broadly categorized in 
immersive tendencies and attention, these factors 
are suggestibility of immersion, tendency to day- 
dream, becoming lost in novels, concentration, and 
robustness to distracting events. 

Other personality factors impacting on presence 
are empathy, absorption, creative imagination, per- 
sonality, cognitive style, and willingness to be trans- 
ported in the VE (Heeter, 1992; Lombard & Ditton, 
1997; Sas & O’Hare, 2001, 2003; Sas et al., 2003). 

Conceptualizing presence is the initial stage of 
understanding this construct. It has been followed by 
the attempts of measuring presence. Different meth- 
ods and measurement instruments have been pro- 
posed for offering quantitative indicators of the de- 
gree of presence that one can experience. 



MEASURING PRESENCE 

Despite its significance, measuring presence raises 
significant challenges, primarily related to the nature 
of presence. Presence is a psychological phenom- 
enon, subjectively experienced inside the inner world 
of one’s consciousness. Therefore, capturing and 
analyzing it requires a certain degree of introspection,
together with one’s understanding of what presence
means. In addition, presence is a state or a
transient psychological condition that is context- 
dependent and that, accordingly, could vary within 
the same individual during an experiment. 

Therefore, participants could encounter difficul- 
ties in assessing their level of presence after the 
task has been completed and the experiment has 
ended. Even more difficult is measuring presence 
during the experiment. This involves asking some- 
body to be permanently aware of each change 
occurring in his or her level of presence. Such a 
requirement adds itself to those involved in the 
execution of the task, therefore inducing cognitive 
overload. This either could prevent the subjects 
from experiencing presence or could affect the task 
performance. Either case impacts on the measure- 
ment validity. 

Another difficulty in measuring presence is re- 
lated to the complexity and multi-dimensionality of 
this construct (Lombard, 2003), which is reflected 
in the different definitions and theories trying to 
explain presence. In addition, presence research 
seems to be an interdisciplinary field that benefits 
from inputs from various disciplines, such as psychology,
philosophy, computer science, media studies, and
drama studies, to name the most important ones.
These multiple perspectives provide valuable
insights into understanding presence, but at the
same time they come at a cost. A fully articulated 
and accepted theory of presence requires a com- 
mon understanding of presence. 

Lombard (2003) identified two general ap- 
proaches to measuring presence: subjective mea- 
surements and objective measurements. Subjective 
measurements usually consist of self-rating ques- 
tionnaires that require participants to evaluate the 
experienced level of presence. The main advantage
of subjective measurements lies in their
accessibility. They also come at a low cost and,
very importantly, appear to be valid and reliable
measures (Lombard, 2003; Prothero et al., 1995). 
Such questionnaires have been developed by 
Lessiter et al. (2000), Lombard (2000), Schubert 
(1999), Witmer (1998), and Slater et al. (2000). 

Limitations of this approach are related mainly 
to the inner and versatile nature of presence and to 
the level of introspection that participants are assumed
to be able to achieve. Such types of information
could be elicited post-experiment or during the
experiment.

In order to overcome some of the limitations 
related to subjective measures of presence, another 
approach started to emerge. At the core of objective 
measurements lies the hypothesis that, while users 
experience presence, a series of physiological and 
behavioral modifications occur in their bodies.
The particular physiological modifications that were 
considered to reflect presence were skin conduc- 
tance, blood pressure, heart rate, muscle tension, 
respiration, eye movement, posture, and so forth 
(Lombard, 2003). 

These measures involve the recording of such 
modifications in real time and present the considerable
advantage of being unobtrusive. They also can
be carried out without requiring subjects’ involvement
in these measurements. The objective measurements
have their own limitations, such as high
cost and difficulties in administering them. However,
their main drawback concerns the limited
evidence that physiological modifications
correlate with presence (Prothero et al., 1995).

Another aspect of major interest regarding pres- 
ence is its relationship to task performance. The 
significance of this relationship justifies the efforts 
invested in defining and measuring presence. At the 
same time, this issue has generated serious theoreti- 
cal treatments and empirical investigations. 

PRESENCE AND TASK PERFORMANCE

The existence of a relationship between presence 
and task performance is arguable and has given rise 
to a long-standing debate in the presence research 
area. More empirical studies are required in order to 
refute or support this dependency. Theoretical work 
and empirical studies have highlighted two possible 
research positions. The first position states that 
presence is merely an epiphenomenon (Ellis, 1996; 
Welch et al., 1996), and consequently, its impact 
upon task performance is limited. According to this 
position, the role of presence consists only of 
affectively coloring the user’s experience. The sec- 
ond position argues that presence impacts on the 
performance of tasks carried out within the VEs. 
There are two perspectives on this position. 



The first one views it as a mediated relationship. 
In other words, presence and task performance 
could be related, in fact, to a third extraneous 
variable or set of variables (Slater et al., 1996; 
Stanney et al., 1998) that impacts both presence and 
task performance. These extraneous variables were 
considered to be related to the technological aspects 
of VEs, such as improved VEs (Stanney et al., 1998) 
or immersion (Slater et al., 1996). 

The second and probably most important expla- 
nation of this dependency between presence and 
task performance argues for a causal relationship 
(Sadowski & Stanney, 2002). This perspective has 
fueled most of the research in the field. However, 
the issue of a causal relationship presents a twofold
problem. First, it is a challenge to design an experiment
for highlighting the causal relationship; second, this
relationship, if it exists, would seem to be highly
task-dependent (Slater et al., 1996; Stanney, 1995).

The significance of the content being delivered 
through any mediated experience has been related to 
the nature of activity or tasks in which the user 
participates, which, in turn, seems to impact pres- 
ence (Lombard & Ditton, 1997). Heeter (1992) 
distinguished between two potential groups of tasks 
that could impact on presence differently and are 
related to two fundamental types of activity: learning 
and playing. Particularly in the case of tasks involving
a ludic component, the sense of presence is likely
correlated with enjoyment, which, in turn, is likely
correlated with task performance (Barfield et al.,
1995). Tasks or activities that involve ambiguous
verbal and nonverbal social cues and sensitive personal
information better exploit the medium’s potential
to offer presence than do simple nonpersonal
tasks (Lombard & Ditton, 1997). Correlations between
performance improvement and presence appear
to be positive. However, they are usually weak,
since less than 10% of the variance in performance
seems to be accounted for by perceived presence (Snow,
1996).
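
To give a sense of scale for the figure above: if perceived presence accounts for less than 10% of the variance in performance, the implied correlation coefficient is below roughly 0.32. The short Python sketch below only works through that arithmetic. It assumes a simple linear (Pearson) correlation, which the source does not specify, and the function name is hypothetical.

```python
import math


def max_abs_correlation(variance_explained: float) -> float:
    """Largest |r| consistent with a given share of explained variance (r squared)."""
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must be between 0 and 1")
    return math.sqrt(variance_explained)


# Assumption: "less than 10% of variance" is read as r**2 < 0.10.
bound = max_abs_correlation(0.10)
print(f"|r| < {bound:.3f}")
```

Under this reading, a correlation of that size is positive but weak, which matches the description in the text.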

Despite this limitation, the causal relationship of 
presence and task performance has increased face 
validity based on the perceptual and cognitive psy- 
chology of skills transfer (Stanney et al., 1998). In 
this light, an additional benefit of understanding this 
relationship consists of the transfer of skills from the 
VE to the real world. Slater et al. (1996) considered 
presence merely as a facilitator whose main contri- 



514 



Sense of Presence 



bution consists of enabling the user to perform 
naturally in a way similar to the real world or, in other 
words, inducing one’s natural reactions. 



FUTURE TRENDS

The article highlights the uneven interest manifested
in this research area, which particularly favors
technological factors. It therefore advocates a shift of
interest that would motivate studies focusing primarily
on user characteristics (e.g., personality or
cognitive factors rather than bodily-related aspects).
Indeed, it appears that almost half of the variance in
sense of presence is explained by personality factors
(Sas & O’Hare, 2003).

The efforts invested in bridging this gap could be
efficiently exploited for the development of hybrid
theories. Such theories can provide a comprehensive
explanation of presence by focusing simultaneously
on technological and human factors and on
the relationship between them.

CONCLUSION

This article introduces the presence construct, offering
at the same time a review of presence determinants.
Presence determinants are organized into
two fundamental groups: technological and
human factors. However, these two groups of factors
impacting presence, also taken as a basis for
grouping presence theories, should be seen as a
continuum rather than as a dichotomy. Both human
and technological factors should be seen as part of
a wider equation whose addressing increases the
potential of understanding and possibly manipulating
presence.

The inner nature of this phenomenon poses a
series of serious problems for investigating it at both
theoretical and empirical levels. The article outlines
the methods and instruments developed for assessing
presence, with an emphasis on the challenges
and difficulties of measuring.

Apart from the application areas that harnessed
its potential, the interest in presence also is supported
by the frequency of this phenomenon, the
relationship it holds with task performance, and its
likelihood to effectively color a user’s experience,
which, in turn, can contribute to increased satisfaction.
The latter two aspects (performance and satisfaction)
suggest the impact that presence may have
on the perceived usability of a system.

KEY TERMS

Flow: A psychological state experienced when 
there is a match between task requirements and 
user’s skills, a state that involves high attention and 
leads to feelings of control and enjoyment. 

Human Factors: User characteristics in terms 
of personality and cognitive factors that impact the 
task performance and the quality of interaction with 
any artifact. 

Immersion: A quality of a system, usually a
computer-generated world, consisting of a set of technological
factors that enable users to experience the
virtual world vividly and exclusively. 

Presence: A psychological phenomenon en- 
abling the mental state of being there in either 
technologically mediated or imaginary spaces. 

Task Performance: The proficiency of accom- 
plishing a task that allows discriminating the users 
(i.e., experts, novices). 

Technological Factors: Aspects characteriz- 
ing a technical system (i.e., computer-generated 
world) and its components that impact the quality of 
interaction and task performance. 

Telepresence: A psychological phenomenon of 
being mentally present at a technologically mediated 
remote world. 

INTRODUCTION

Hypertexts are electronic presentations of informa- 
tion comprised of any number of documents con- 
nected by electronic links that allow users to move 
between them with a mouse click. In addition to text, 
the documents also may contain pictures, videos, 
demonstrations, or sound resources. With the addi- 
tion of such media, hypertext often is referred to as 
hypermedia. A hypertext can present information 
contained in a college course or the products offered 
by a cleaning supply company. A hypertext can 
contain as few as two documents or as many as the
holdings of an entire library. Because hypertexts can 
be quite large, site maps often are used to provide 
users with an overview of a site’s content and 
structure. While they may appear as simple tables of
contents, they also can provide a graphical representation
of the site’s documents and even the network
of links connecting them. Regardless of the form a 
site map takes, it may appear as a simple overview 
or, more commonly, as an interactive tool in which 
each entry serves as a link to the page it represents. 
Site maps may appear on a hypertext homepage or 
on a separate page, often as a help menu option. 

BACKGROUND 

Indexes and tables of contents, the precursors to site 
maps, have been in use for hundreds of years. Well 
before anyone envisioned a technology as remark- 
able as hypertext, readers were using tables of 
contents and indexes to glean summaries of printed 
texts and to find specific pieces of information. 
Modern psychologists and educational theorists be- 
came interested in such devices as educational tools 
and have published a number of formal studies on 
that topic. The overriding conclusion drawn by that 
body of research is that outlines, indexes, and tables
of contents, called advance organizers in the psychology
literature, can augment what is learned from
traditional text (Glover & Krug, 1988; Kraiger,
Salas, & Cannon-Bowers, 1995; Snapp & Glover, 
1990; Townsend & Clarihew, 1989). One reason 
advance organizers work is that they cue the reader 
to access existing memories that may help to orga- 
nize or anchor new information from the text (Mayer, 
1979). 

It also has been shown, though, that advance 
organizers augment learning on the part of domain 
novices, who have little stored knowledge from 
which to draw. In this situation, advance organizers 
appear to work, because they provide a structure in 
which to organize new information (Townsend & 
Clarihew, 1989). For example, Mannes and Kintsch
(1987) gave novice learners a text accompanied by 
an advance organizer that either mirrored the struc- 
ture of the text or provided a different organization. 
When tested for both recall and recognition of the 
text content, those in the compatible condition out- 
performed those in the mismatched condition. While 
these results indicate that the compatible advance 
organizer provided a structure in which the text 
content was more easily stored, it should be noted 
that those in the inconsistent condition performed 
better on a problem-solving posttest. The authors 
argue that the additional work required to create an 
organized understanding of the text in the face of an 
inconsistent organizer may have resulted in deeper 
understanding, which is reflected in the problem- 
solving task. 

With such evidence pointing to the educational 
benefits of advance organizers, psychologists natu- 
rally became interested in the potential of their 
electronic cousins — site maps — to promote 
hypertext-based learning. Much of the work on 
advance organizers has transferred well to learning 
with site maps. Additionally, site maps have been 
used to remedy the problem of getting lost in the 
information space, a problem not generally encoun- 
tered with traditional text. The following sections 
summarize what is known about the use of site maps 
for staying oriented and for augmenting learning. 

Site Maps for Staying Oriented and Finding Information

When working in a large hypertext, it is not uncom- 
mon for users to find themselves lost or disoriented. 
When this happens, users are pulled away from their 
primary task, whether that be searching for a spe- 
cific piece of information or learning the global 
content. This experience commonly is referred to as 
being lost in hyperspace. The danger posed by that 
cognitive state is that users become so focused on 
finding their way through the system that they are 
unable to achieve their intended goals. Site maps 
keep users oriented by providing them with a view of 
the system contents as a whole. The effect is similar 
to the familiar you-are-here maps often available at 
museums or large shopping malls. It is well accepted 
that site maps are effective for remedying the 
experience of getting lost and allowing users to find
their way more quickly and easily and to return to
their primary goals (Hammond & Allinson, 1989;
Monesson, 2002). Indeed, in a review of studies 
exploring the effectiveness of educational hypertext, 
Chen and Rada (1996) conclude that site maps 
“appear to be necessary for users dealing with large 
and complex information structures and to be useful 
to resolve the problems of disorientation and high 
cognitive overhead” (p. 149). The more accurately 
the site map represents the hypertext structure and 
content, the more useful it will be in orienting lost 
explorers. 

Site maps also serve a similar purpose to a 
traditional table of contents in that they inform users 
about the topics represented on the site. Another 
purpose of site maps results as a consequence of 
those already described here. Specifically, once 
users are aware of where they are and what infor- 
mation is contained in the hypertext, a site map can 
help users to find their way to a desired location on 
the site. That is, they can help users plan the
best route to a given page. Indeed, site maps have
been shown to alter learners’ search performances 
and browsing behaviors (Chou, Lin, & Sun, 2000; 
McDonald & Stevenson, 1999; Monesson, 2002; 
Puntambekar, Stylianou, & Hübscher, 2003).
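
The idea of planning a route to a given page can be illustrated by treating a hypertext as a directed graph of pages and links. The Python sketch below is only an illustration under that assumption and is not a method taken from the studies cited. The page names and the `shortest_route` helper are hypothetical. It uses a breadth-first search to find a route with the fewest link traversals.

```python
from collections import deque


def shortest_route(links, start, goal):
    """Return a shortest list of pages from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # page -> page it was first reached from
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in previous:
                continue  # already reached by an equally short or shorter route
            previous[target] = page
            if target == goal:
                # Walk back from the goal to the start, then reverse.
                route = [goal]
                while previous[route[-1]] is not None:
                    route.append(previous[route[-1]])
                return route[::-1]
            queue.append(target)
    return None


# Hypothetical hypertext: each page maps to the pages it links to.
links = {
    "home": ["animals", "ecosystems"],
    "animals": ["mammals", "birds"],
    "ecosystems": ["forest"],
    "forest": ["birds"],
}
route = shortest_route(links, "home", "birds")
print(" -> ".join(route))
```

An interactive site map gives the user this same overview visually: the whole link structure is in view, so a short path can be chosen before any page is opened.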



Site Maps for Learning 

While the ability of site maps to orient users is fairly 
clear, their effectiveness as learning tools is less 
certain. Specifically, the research on learning out- 
comes using site maps paints a picture of a tool that, 
at first glance, appears unpredictable. Some studies 
conclude that there is no educational benefit of using 
site maps. For example, Wenger and Payne (1994)
found that learners using site maps increased the 
amount of a hypertext they visited but observed no 
accompanying increase in learning outcomes. Others,
such as Niederhauser, Reynolds, Salmen, and
Skolmoski (2000), initially found some effect of site
maps on learning, only to determine through regression
analyses that the impact was minimal. Likewise,
Nilsson and Mayer (2002) have shown that
the effect of a map can be contingent on user 
characteristics such as spatial ability. 

Other studies, however, have found more significant
educational benefit from site map use, but that
benefit has been observed only for domain novices
(Potelle & Rouet, 2003; Puntambekar, Stylianou, &
Hübscher, 2003). For example, Potelle and Rouet
(2003) provided novice and more advanced learners
with a hypertext accompanied by one of three site
maps: a hierarchy, a graphical network representing
the system’s nodes and links, or an alphabetical
index. No differences in the advanced learners’
performances were detected among site map condi- 
tions. The novices, however, performed best on 
learning posttests when they used the hierarchical 
map. The authors concluded that the novices ben- 
efited from the clear structure and transparent orga- 
nization of the site map. It may be the case that the 
site map’s structure allowed the novices to create a 
similarly organized mental model for the material 
that enhanced their understanding. Since domain 
experts are understood to possess well organized 
knowledge structures (Chase & Simon, 1973; Chi & 
Koeske, 1983; West & Pines, 1985), it is under- 
standable that the more advanced learners did not 
benefit from the hierarchical site map. These learn- 
ers were less likely to require the outside organizer. 

Shapiro (1998) also found that prior knowledge is 
a mediating factor in determining whether users 
benefit from site maps. She presented learners with
a hypertext about animals and their ecosystems. 
While the hypertext was identical between condi- 
tions, subjects were assigned to hierarchical site map 
conditions that represented the system as structured 
either by ecosystems or by animal families. All 
subjects were pretested for their knowledge of the 
topic to ensure that they had moderate to high knowl- 
edge of animal families but poor knowledge of how 
ecosystems function. All subjects performed equiva- 
lently on the animal families posttest, regardless of 
which site map they used. Those who used the 
ecosystems site map, however, outperformed their 
counterparts on the ecosystems posttest questions. 
As in Potelle and Rouet (2003), then, these results 
indicate that site maps benefit learners primarily 
when they lack sufficient prior knowledge about a 
topic. 

Prior knowledge is not the only variable that 
determines the educational effectiveness of site maps. 
Their compatibility with learners’ goals is also impor- 
tant. This point is demonstrated in a series of studies 
by Dee-Lucas and Larkin (1995). When subjects in 
their first study were given no specific learning goal, 
all learned equivalently regardless of whether they 
read a hypertext equipped with a hierarchical site 
map or an alphabetical index (a third control group 
read the same information as traditional text but 
performed less well than both hypertext groups). 
When subjects in a second experiment were given 
the explicit goal of summarizing the hypertext’s in- 
formation, those exposed to the hierarchical site map 
outperformed the other subject groups. The act of 
summarizing a hypertext requires that one under- 
stand the relationships between ideas and the content 
of each page. Because hierarchies define relation- 
ships between topics, it is understandable that the 
ability to summarize would be enhanced by exposure 
to a hierarchical site map. In sum, when the site 
map’s structure matched the learners’ goal, the 
learning goal was better achieved. 

FUTURE TRENDS 

At the present time, there is good reason to use site 
maps for the purpose of staying oriented and, in the 
case of beginning learners, for promoting good learning
outcomes. As the World Wide Web becomes
more commonplace in the everyday functioning of 
the classroom, site maps should prove to be impor- 
tant components of educational sites. 

A great deal of research on site maps is required 
before their utility will be fully understood and can 
be capitalized on. Once the relationship among site 
maps’ structures, features of the hypertext they 
represent, and user characteristics is understood, 
designers will be ready to equip hypertexts with 
adaptive site map modules. With adaptive modules, 
site maps can be generated automatically and con- 
figured to best meet the needs of any given user. To 
take advantage of adaptive site maps, educators or 
users themselves may be able to enter information 
about users’ knowledge states, their goals, and 
other relevant information. The system then will 
generate a site map tailored for that specific context.
It even may be possible to enter a learning
goal and have the system generate a site map that 
illustrates a path through the system tailored to that 
goal. Much as drivers are able to call up a recom- 
mended route using services such as 
www.MapQuest.com, learners may one day be 
able to call up recommended routes through large 
databases (e.g., the U.S. Library of Congress) to
gain an understanding of a topic or domain. 
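
What such an adaptive module might look like can be sketched very roughly in Python. The selection rule below is an assumption made for illustration only. It is loosely inspired by the findings summarized earlier (novices and learners with a summarizing goal did better with hierarchical maps) and is not an established algorithm. The function names, goal labels, and page names are all hypothetical.

```python
def choose_map_style(is_novice: bool, goal: str) -> str:
    """Pick a site map style from two user attributes (illustrative rule only)."""
    if is_novice or goal == "summarize":
        return "hierarchy"
    return "index"


def render(pages: dict, style: str, root: str = "home") -> list:
    """Return the site map as a list of text lines.

    Assumes `pages` describes a tree (no cycles): page -> list of child pages.
    """
    if style == "index":
        return sorted(pages)  # alphabetical index of page names
    lines = []

    def walk(page, depth):
        lines.append("  " * depth + page)  # indent two spaces per level
        for child in pages.get(page, []):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


# Hypothetical hypertext structure.
pages = {
    "home": ["animals", "ecosystems"],
    "animals": ["birds"],
    "ecosystems": [],
    "birds": [],
}
style = choose_map_style(is_novice=True, goal="browse")
print("\n".join(render(pages, style)))
```

A real adaptive system would need a far richer user model than two attributes. The sketch only shows how information about knowledge state and goals could drive the choice of map.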

CONCLUSION 

Site maps are useful for keeping users oriented in a 
hypertext, informing them about the nature of the 
site’s content, and helping them to determine how to
find their way to a desired page or topic. Site maps 
have not been found to enhance learning outcomes 
for users already knowledgeable in a domain, al- 
though that question is somewhat understudied. Site 
maps have been found to enhance learning for 
domain novices. Their effectiveness for this group, 
however, also is contingent on how well the struc- 
ture of the site map coheres with a user’s learning 
goals. Much more research is needed, however, to 
understand fully how to capitalize best on site maps 
for educational purposes. Specifically, much re- 
mains to be learned about the factors mediating 
their ability to enhance learning. 

KEY TERMS

Advance Organizer: Any presentation of infor- 
mation that displays and represents the content and 
structure of a hypertext or text. 

Hypertext: A collection of electronic texts con- 
nected through electronic links. In addition to text, 
the documents also may contain pictures, videos, 
demonstrations, or sound resources. With the addi- 
tion of such media, hypertext often is referred to as 
hypermedia. 



Mental Model: The content and structure of an 
individual’s knowledge.

Prior Knowledge: An individual’s collected 
store of knowledge prior to exposure to a hypertext 
or other body of information. 

Site Map: An electronic representation of the 
documents in a hypertext and sometimes the links 
connecting them. Site maps may appear as simple 
overviews or as interactive tools in which each entry 
serves as a link to the page it represents. 

INTRODUCTION

Designing an attractive user interface for Internet 
communication is the objective of every software 
developer. However, it is not an easy task as the 
interface will be accessed by an uncertain number of 
users with various purposes. To interact with users, 
text, sounds, images, and animations can be provided 
according to different situations. Originally, text was 
the only medium available for a user to communicate 
over the Internet. With technology development, 
multimedia channels (e.g., video and audio) emerged 
into the online context. 

Individuals’ sociability may influence human 
behaviour. Some people prefer a quiet environment 
and others enjoy more liveliness. On the other hand, 
the activity purpose influences the environment pref- 
erence as well. Following usability principles and 
task analysis (Badre, 2002; Cato, 2001; Dix, Finlay, 
Abowd, & Beale, 1998; McCracken & Wolfe, 2004;
Nielsen, 2000; Nielsen & Tahir, 2002; Preece,
Rogers, & Sharp, 2002), we can predict that busi- 
ness-oriented systems and informal systems will 
require different types of interfaces: Business
systems are concerned with the efficiency of performing
tasks, while the effectiveness of informal
systems depends more on the user’s satisfaction with the
experience of interacting with the system.

Suppose you are an Internet application designer;
should you provide a vivid and multichannel
interface or a concise and clear appearance? When
individuals’ sociability and the activity purpose
conflict, should the interface design follow the
sociability requirement, the purpose of the activity, or
neither of them?

To answer these questions, the characteristics of 
communication interfaces should be examined. For 
face-to-face communications, sounds, voices, vari- 
ous facial expressions, and physical movements are 
the most important contributing factors. These fea- 
tures are named physical and social presence (Loomis, 
Golledge, & Klatzky, 1998). 

In the virtual world, real physical presence does
not exist anymore; however, emotional feelings,
group feelings, and other social feelings still exist
but vary in quantity. The essential difference between
interfaces is the quantity of social feelings they
present. For example, a three-dimensional (3-D)
interface may provide more geographical and social
feelings than a two-dimensional (2-D) chat room.

To assess the different feelings that may emerge 
from different interfaces, a two-dimensional chat 
room and a three-dimensional chatting environment 
were developed. The identification of social feelings 
present in the different interface styles is presented 
first. Then an experiment that was carried out to 
measure the influence the activity styles and the 
individuals’ sociability have on the interface prefer- 
ences is discussed. 

The questions raised in this article are “What are 
the social feelings that may differ between the two 
interfaces (2-D vs. 3-D)?” and “Will users prefer 
different interfaces for different types of activi- 
ties?” 
Table 1. Different activities

Business Oriented              Social Oriented
Do math homework               Take a break from work
Schedule technical meetings    Fill up free time
Seek technical advice          Gossip and chat

BACKGROUND 

Graphically, Internet communication interfaces can 
be classified into two categories: two dimensional 
and three dimensional. A 2-D interface is an accept- 
able choice for our flat monitor. 3-D interfaces apply 
various graphical algorithms to simulate the sense of 
depth in 2-D interfaces; hence, most 3-D interfaces 
can be defined as 2.5-D. In this article, the 3-D 
interfaces mentioned below can actually be classi- 
fied into 2.5-D. 

Social Presence 

Communication channels are vivid in face-to-face 
communication. Physical movement, facial expres- 
sions, and variations of sound create the diversity. 
Computers and the Internet cannot provide the 
physical presence of users. Instead, people feel that 
they are chatting directly with other users. This is 
called social presence. 

Social presence is defined as the “degree of 
salience of the other person in the interaction and the 
consequent salience (and perceived intimacy and 
immediacy) of the interpersonal relationships” (Short, 
Williams, & Christie, 1976, p. 65). 

Communication researchers (Bailenson, 
Blascovich, Beall, & Loomis, 2001; Short et al., 
1976) argue that even in a text-dominated environ- 
ment, social presence still exists and provides impor- 
tant functions. 

Interfaces with rich or poor communication chan- 
nels may lead to different amounts of perceived 
social feelings. Witmer and Singer (1998) discussed 
some factors influencing social presence. These 
factors include the degree of control, environmental 
richness, multimodal presentation, scene realism, 
immediacy of control, anticipation, mode of control,
physical modifiability, sensory modality, degree of
movement perception, active search, isolation, se- 
lective attention, interface awareness, and meaning- 
fulness of the experience. 

With social-presence theory, different interfaces 
can be classified and assessed by the amount of 
social feelings presented. 

Human Sociability Style 

Sociability is defined as the quality or state of being 
sociable. The Merriam-Webster online dictionary 
(1996) defines sociable as the inclination by nature 
to companionship with others of the same species. 

Personality is an important factor that differenti- 
ates humans (Nye & Brower, 1996). The same 
events may trigger significantly different feelings 
and actions according to different sociabilities. 

An individual’s sociability may influence his or 
her actions and scene preferences. Some people 
may enjoy going out and socializing with friends 
while others prefer reading a book alone. Their 
different social preferences may further influence 
their choice of Internet communication interface 
and their preference of the quantity of social-pres- 
ence feelings. 

Activity Style 

The purpose of communication can be classified into 
two general categories: business oriented and social 
oriented. For business-oriented communication, 
people intend to grasp the information they need as 
soon as possible. On the other hand, people use
social-oriented communication to make friends, set up
relationships, and create social networks. Table 1 
lists some typical business-oriented activities and 
social-oriented activities. 

Business-oriented activities may require an easy- 
to-use and concise environment, for example, an 
office, a conference room, or a classroom. In this 
kind of environment, people know who is in charge, 
know the problems they are trying to discuss, and 
intend to work out solutions as soon as possible. 

Social-oriented activities demand a relaxing, free, 
and highly sociable context, for example, a restau- 
rant, a bar, or a private garden. In this kind of 
environment, people can relax and enjoy their time. 
Figure 1. The 2-D interface




2-D vs. 3-D INTERFACES 

One 2-D and one 3-D prototype interface were 
developed. The 2-D and 3-D interfaces only apply 
commonly available techniques to satisfy the univer- 
sality requirements for interface and experiment 
assessment. 

The 2-D interface is designed as a relatively 
simple environment. Similar to chatting environments 
over the Internet, the 2-D interface is based on 
textual message transmission. However, the user’s 
image can be displayed as well. After logging onto 
the 2-D interface, a user can easily find others logged 
onto the server and get an overview of the whole
environment. A typical screen of the 2-D interface
is shown in Figure 1.

The design idea of the 3-D interface emerged 
from real-time 3-D games, and the interface is 
converted from a 3-D maze. The interface is di- 
vided into different spaces by walls and users can 
move around with the aid of cursors. The 3-D 
interface displays at a glance within a position panel 
the current relative positions of all users. To engage 
in conversation with others, a user needs to walk 
close enough to another user. When a conversation 
starts, textual messages will be shown both within 
a message box outside of the 3-D space and in a 
dialog bubble within the environment. To represent 
each user, facial images are displayed. However, 
users have a choice to use cartoon images to 
represent themselves. Automatic facial-expression 
display is achieved in the 3-D interface by integrat- 
ing a text-to-emotion engine. A user’s text input is 
sent to the emotion-extraction engine to examine 
emotion information that is embedded. The 3-D 
interface receives the output from the emotion- 
extraction engine and displays corresponding ex- 
pression images. Further discussion about emotion- 
extraction engines can be found in Boucouvalas, 
Xu, and John (2003), Xu and Boucouvalas (2002), 
and Xu, John, and Boucouvalas (2003, in press). 
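
The flow described above (text input, emotion extraction, expression-image display) can be illustrated with a minimal sketch. It assumes a plain keyword lookup, which is only a stand-in and not the engine described in the cited papers; the keyword table, the function names extract_emotion and expression_image, and the image file names are all hypothetical.

```python
# Hypothetical keyword-based stand-in for a text-to-emotion engine.
# It is NOT the engine from the cited papers; it only shows the data flow.
import re

# Assumed keyword table; a real engine would analyse the sentence itself.
EMOTION_KEYWORDS = {
    "happy": {"great", "glad", "happy", "wonderful"},
    "sad": {"sad", "sorry", "unhappy"},
    "angry": {"angry", "furious", "annoyed"},
}

# Assumed mapping from an emotion label to an expression image file.
EXPRESSION_IMAGES = {
    "happy": "face_happy.png",
    "sad": "face_sad.png",
    "angry": "face_angry.png",
    "neutral": "face_neutral.png",
}


def extract_emotion(message):
    """Return the first emotion whose keywords occur in the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if words & keywords:
            return emotion
    return "neutral"


def expression_image(message):
    """Pick the expression image the interface would display."""
    return EXPRESSION_IMAGES[extract_emotion(message)]
```

In this sketch the interface would call expression_image on every outgoing message and swap the sender's facial image accordingly; messages without a recognised keyword fall back to a neutral face.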

A typical screen of the 3-D interface is shown in 
Figure 2. 



Figure 2. The 3-D interface

Figure 3. User-viewing component showing the
movement of a user 




SOCIAL FEELINGS DISCUSSIONS 

In this section, we describe the possible different 
social feelings perceived by users. A total of eight 
different senses are discussed here. 

• Movement Senses: Similar to most chatting 
environments, every user in the 2-D chat pro- 
totype is in a fixed position. In contrast, the 3- 
D interface provides some aspects of move- 
ment. Users not only move around the space, 
but also have a specific field of view and can 
look for the spatial guidelines from the position- 
guide panel. 

• Geographic Senses: Unlike 2-D chatting in- 
terfaces, a 3-D interface allows for various 
complex geographic entities such as a city or 
something as simple as a tree. Users may 
perceive geographic-movement phenomena in 
the virtual movement. Figure 3 shows the field
of view and the position-guide component of
the 3-D interface.

Figure 4. Different eye-contact angle
(a) Direct glance

• Sense of 3-D Depth of Space: Space depth is 
a widely presented feature in both 3-D games 
and in real life. However, most 2-D interfaces 
cannot provide the same feelings. An example 
of the sense of depth can be found in Figure 3, 
which shows the use of perspective. 

• Exploration of Space: For 2-D interfaces, the 
whole interface is presented to users. Users 
know at the beginning who is in the environ- 
ment, whom they are talking to, and what 
functions the interface provides. For 3-D inter- 
faces, users need to explore the space to meet 
others or to access the assistant functions 
provided by the system (e.g., buy something 
from a virtual shop). A position panel can only 
provide some aspects of the overall location 
and limited user information. 

• Eye Contact: Eye contact is also very impor- 
tant. It is impolite to turn our backs to people 
talking to us. For the 2-D text-based interface, 
users cannot move their positions, and the 
images representing them are fixed. Thus, no 
virtual eye contact can be established. The 3- 
D interface provides the possibility of making 
virtual eye contact as users move around the 3- 
D interface. Figure 4 demonstrates the viewing 
component of the 3-D interface, which shows 
a direct glance and side-glance. 

• Communication Efficiency: In the 2-D text-based
system, a user’s input sentences can be
viewed by everyone. However, when a large
number of users are exchanging messages
quickly, it is difficult to follow the messages of
a particular user. For 2-D chatting interfaces,
communication will not be efficient for a large
number of people gathering in the same room.
With the speech bubbles, users in a 3-D system
can identify others’ chatting messages relatively
easily. Users can concentrate on one
user’s speech by moving and changing the eye
angles. Figure 5 demonstrates a busy chat
environment with a 2-D interface and a chat
environment with a 3-D interface.

Figure 4 (continued). Side glance

Figure 5. Group discussion in different interfaces
(b) The 3-D interface

• Social-Attraction Feeling: When some people 
gather together to discuss something, we may 
assume that interesting events or urgent situa- 
tions occurred. The social-attraction feeling 
still applies to computer communication. For 2- 
D interfaces, users can judge the number of 
users discussing a topic only by scrolling the 
text. For 3-D interfaces, users can find this 
information visually by glancing for a cluster of 
gathered users from the position panel and the 
viewing component. Figure 5b demonstrates
this feeling in a 3-D interface.



• Movement Plus Talk: In the 2-D text-based 
system, a user’s position is fixed. For the 3-D 
system, movement is a fundamental element. It 
is quite possible that some users may chat while 
moving. 
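
Two behaviours described in this section, starting a conversation only when users are close enough and spotting a cluster of gathered users on the position panel, both reduce to distance checks on user positions. The sketch below is one possible way to express them, assuming 2-D panel coordinates; the threshold TALK_RADIUS, the function names, and the sample user names other than Peter are hypothetical.

```python
# Sketch only: proximity-based conversation and cluster detection on a
# position panel. Coordinates and the radius are illustrative assumptions.
import math

TALK_RADIUS = 2.0  # hypothetical distance within which two users can converse


def can_talk(a, b, radius=TALK_RADIUS):
    """True when two (x, y) positions are within talking distance."""
    return math.dist(a, b) <= radius


def gathered_groups(positions, radius=TALK_RADIUS):
    """Group users into clusters, largest first.

    A cluster is a connected component of the "within radius" relation,
    so users standing in a chain still count as one gathering.
    """
    names = list(positions)
    seen = set()
    groups = []
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        group, stack = [], [start]
        while stack:
            current = stack.pop()
            group.append(current)
            for other in names:
                if other not in seen and can_talk(
                    positions[current], positions[other], radius
                ):
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(group))
    return sorted(groups, key=len, reverse=True)
```

A position panel could highlight the first group returned by gathered_groups to convey the social-attraction feeling, while can_talk decides whether a speech bubble may be opened between two users.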

DESIGN CONSIDERATIONS 

Social feelings are important factors for Internet 
communication interfaces, and they may vary in 
different environments. First, our article focuses on 
answering the question, What kinds of social feel- 
ings may be perceived in different environments? A 
2-D interface and a 3-D interface will be compared 
in order to answer this question. 

Second, both activity styles and individuals’ so- 
ciability styles influence interface preferences and 
have different requirements on social-presence
feelings. Which one is more important, the activity
style or sociability (in other words, the activity or
the human)? We expect that when an individual has a
particular aim, for example, solving a crucial prob- 
lem or finding friends to talk to, then the activity style 
will dominate the preference of interface instead of 
the individual’s sociability style. 

We expect the following phenomena in Internet
communications.

1. The activity style will strongly influence the 
preference of social presence. 

2. Human sociability will strongly influence the 
overall preference of social presence. 

3. An individual’s sociability style will not strongly
influence the preference of individual interfaces
when the activity is chosen; hence, the
activity style is the dominant factor in
interface preference.

The following experiments will present a detailed 
discussion about the phenomena. 

SOCIABILITY AND 
ACTIVITY STYLE EXPERIMENT 

There are two main aims of this experiment. 

1. To examine the preference of participants for 
different interface styles when performing dif- 
ferent types of activities. 

2. To assess the effect that an individual’s sociability
has on the satisfaction rating of
each style of interface.

Two styles of interfaces are presented to participants. 

1. A 2-D interface that is a less sociable environment
(Users are split into different rooms.
Each room lists the current users online. Users 
formally request connections before joining 
conversations.) 

2. A 3-D interface that is a more sociable envi- 
ronment (All users explore the same 3-D space. 
All users are free to explore, approach other 
users, and engage them with conversation.) 



There are two types of activities considered. 

1. Business oriented (e.g., solving a technical 
problem, etc.) 

2. Social oriented (e.g., having a tea-break chat, 
etc.) 

THE EXPERIMENT PROCEDURE 

A total of 50 students and staff from Bournemouth
University participated in the experiments. The gender
of each participant was recorded, and each
participant completed a questionnaire assessing
his or her sociability. The questionnaire was developed by
Bellamy and Hanewicz (1999) and contains seven
items with five scale points ranging from agree to
disagree in order to measure the sociability of the
participants.
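
As an illustration of how such a questionnaire might be turned into a sociability grouping, the sketch below sums seven responses on a five-point scale and splits participants at the median. The source does not state the scoring direction or how the sociable and less sociable groups were formed, so the 1 to 5 coding, the summed score, and the median split are all assumptions, and the function names are hypothetical.

```python
# Sketch under stated assumptions: 7 items, each coded 1 (disagree) to
# 5 (agree), summed into one score; groups formed by a median split.
from statistics import median

N_ITEMS = 7
SCALE_MIN, SCALE_MAX = 1, 5


def sociability_score(responses):
    """Sum the seven item responses after validating their range."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected %d responses" % N_ITEMS)
    if any(not SCALE_MIN <= r <= SCALE_MAX for r in responses):
        raise ValueError("responses must lie between 1 and 5")
    return sum(responses)


def split_by_median(scores):
    """Split {participant: score} into (high, low) sociability groups."""
    cut = median(scores.values())
    high = sorted(p for p, s in scores.items() if s > cut)
    low = sorted(p for p, s in scores.items() if s <= cut)
    return high, low
```

With this coding the possible scores range from 7 to 35; participants scoring exactly at the median are placed in the lower group, which is an arbitrary choice made only for the sketch.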

Participants then viewed the two interfaces (2-D 
and 3-D). After viewing them, participants were 
shown a list of 12 activities that can be performed in
both environments. The participants were instructed 
to select the style of interface that was best suited 
for specific activities. 

The 12 activities can be divided into two styles: 
business oriented and social oriented. Six of them 
belong to the business-oriented group and the other 
six are classified into the social-oriented group. The 
activities are shown in Table 2. 



EXPERIMENT RESULTS 

According to the activity style, we classify the 
results into two categories. The results of the busi- 
ness-oriented activities are shown in Figure 6, and 
Figure 7 presents the results of the social-oriented 
activities. 

The charts show that more participants chose the 
2-D interfaces for the business-oriented activities, 
and the 3-D interfaces were chosen for most social- 
oriented activities. 

Table 3 lists the summary of dependent variables 
(2x3). 
Table 2. The name and style of the activities

Activities                                                    Style
Conduct a technical meeting over the Internet (Business 1)    Business oriented
Seek technical advice about your computer (Business 2)        Business oriented
Monitor your employee’s progress (Business 3)                 Business oriented
Study online (Business 4)                                     Business oriented
Chat about the latest celebrity gossip (Social 1)             Social oriented
Seek new friends (Social 2)                                   Social oriented
Watch an animation (Social 3)                                 Social oriented
Play a multiplayer game (e.g., football) (Social 4)           Social oriented
Privately chat with your good friends (Social 5)              Social oriented
Do your math homework (e.g., 3x + 5y = 70) (Business 5)       Business oriented
Discuss stock-market news (Business 6)                        Business oriented
Display an exhibition of your paintings (Social 6)            Social oriented

Figure 6. The business-oriented activities 



Figure 7. The social-oriented activities 
Table 3. Value explanation

Activity Style          Interface Choice
0: Business oriented    2: 2-D interface
1: Social oriented      3: 3-D interface
                        0: Neither



• Activity Style vs. the Choice of Interfaces:
Correlation tests were carried out to examine
the relationship between the choice of
interface and the activity style. The correlation
coefficient is 0.597, which is significant at the p = 0.01 level. The
results demonstrate that there is a significant 
relationship between activity style and the pref- 
erence of interface. When users need to carry 
out a business-oriented activity, for example, 
finding an emergency telephone number, most 
people will prefer a simple interface that pre- 
sents a low level of social feelings. When users 
want to spend their spare time, for example, 
playing an online game, the preferred interface 
is a relatively complex and vivid multichannel
environment. The practical implication for interface
design is that if the purpose of the online
communication is to provide technical help or technical
discussions for the users, a simple, straightforward,
and uncluttered environment will be preferred
by users. If the purpose of the online
communication is to relax and to enjoy the
online lifestyle, a vivid video- or audio-assisted
environment will suit most online surfers.

• Sociability vs. Choice of Interface: The
correlation test does not find any significant
link between the individuals’ sociability and
interface preference. The results of the t-test
show that the distribution of the preference
ratings of highly sociable participants was not
significantly different from that of the less sociable
participants for the 12 activities. However, a
marginally significant difference was found in
a t-test between the two groups of
participants for Business 1 and Social 2, for which p =
0.08 and p = 0.09, respectively. To further
explore the role of human sociability, we
calculated the means of the 12 choices and
repeated the t-tests. The results show that
there is a marginally significant difference between
the ratings of the sociable and less
sociable participants (p = 0.06) for the means
of the 12 choices. The results suggest that, at the
overall level, sociability has a marginally significant
influence on the preference of interface style. On
average, low-sociability persons prefer simple
and clean interfaces, and high-sociability persons
prefer complex and realistic interfaces.
However, for specific activities, human sociability
has very limited influence on interface-style
preference. The influence of an individual’s
sociability is much weaker than the influence of
the activity style. This analysis provides another
design criterion for online communication.
Designers should give more consideration
to the activity that may be carried out on the
Web. However, if a series of online communication
interfaces will be presented to a specific
user, human sociability should be considered
and designers should incorporate sociability into the
design consideration.

• Gender vs. the Choice of Interface: Are
there any differences in interface preference
between genders? Will a female prefer a vivid 
online environment more than a male will? To 
answer this question, a t-test and correlation 
test were carried out. The result shows that 
there is one marginally significant difference (p 
= 0.075) between the ratings of males and 
females that is found for Business 2, and one 
significant difference (p = 0.04) that is found 
for Social 3. However, there is no significant 
correlation that supports these effects. This 
analysis indicates that gender has a very lim- 
ited amount of influence on the preference of 
interface. 






Social Factors and Interface Design Guidelines 



• Revisiting the Phenomena: It is now time to 
reexamine the phenomena. It can be seen that 
the activity style has a strong influence on the 
preference of the interface and the corre- 
sponding social-presence feelings. Overall, the 
sociability style of individuals strongly influ- 
ences the preferences of social feelings, but 
not at the individual activity level. The three 
phenomena were observed in the experiment. 
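
The two analyses used in this section, a correlation between coded variables and a t-test between two participant groups, can be sketched as follows. The sketch uses the coding of Table 3 (activity style 0 or 1, interface choice 2 or 3) and assumes that "Neither" responses (coded 0) are dropped beforehand. The source does not say which correlation coefficient or which t-test variant was used, so Pearson's r and Welch's t statistic are assumptions, and the sample values in the usage note are made up for illustration and are not the experiment's data.

```python
# Sketch only: Pearson correlation and Welch's t statistic in plain Python.
# The choice of both statistics is an assumption, not the authors' method.
from math import sqrt
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (no p-value)."""
    return (mean(a) - mean(b)) / sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )
```

For example, with the illustrative codes activity_style = [0, 0, 0, 0, 1, 1, 1, 1] and interface_choice = [2, 2, 2, 3, 3, 3, 3, 2], pearson_r returns 0.5: a positive value means that social-oriented activities go together with the 3-D choice. Turning the t statistic into a p-value needs the t distribution, which a statistics package would normally supply.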



FUTURE TRENDS 

Adaptivity is an extremely important interface design
criterion. To create adaptive systems, human
factors (e.g., emotion, cognition, and personality) 
need to be considered carefully. To attract the 
targeted audience, the system should analyse the 
potential users’ customs and hobbies. As the pur- 
pose of a software system varies, the purpose may 
influence the preference of the interface. This ar- 
ticle shows some general design guidelines and 
presents experiments to demonstrate the guidelines’ 
accuracy. Trends in human-computer interaction 
(HCI) design are the adoption of more human fac- 
tors into design consideration and the development 
of new guidelines (e.g., clear guidelines for sociabil- 
ity, gender, and age). 



CONCLUSION 

Social presence exists everywhere in the virtual 
world. The more social-presence feelings presented, 
the more realistic and more sensible the environment 
is. The argument of our article is the necessity to 
increase social presence everywhere in Internet 
communication. 

A 2-D interface and a 3-D interface were devel- 
oped. The feature comparisons between the two 
interfaces illustrate that a 2-D text-based interface 
is straightforward, and a 3-D interface provides 
some aspects of virtual-reality feelings, which are 
more complex. 

The experiment results show that significant 
differences exist in the preference of social pres- 
ence for different activities. The experiment results 
strongly show that social presence should be consid- 
ered for interface design. 



The individual’s sociability may also influence his
or her preference of social presence. However, a 
significantly different preference is only revealed at 
the overall level, not for most individual activities. 
This indicates that an individual’s sociability does 
impact his or her preference of social presence. 
However, when dealing with specific activities, the 
influence of the activity style is much stronger than
that of the individual’s sociability.

Gender does influence the social preference in 
some specific activities. However, no significant 
preference difference can be found for the majority 
of activities and the overall level. This means the 
impact of gender needs further investigation. 

As high social presence may be linked with a 
vivid or multichannel (e.g., video, audio, or anima- 
tion) communication interface and a simple or text- 
dominated interface may present low social pres- 
ence, the experiment results also provide guidelines 
for HCI design. 






REFERENCES 

Badre, A. N. (2002). Shaping Web usability: In- 
teraction design in context. Boston, MA: Addison- 
Wesley. 

Bailenson, J. N., Blascovich, J., Beall, A. C., &
Loomis, J. M. (2001). Equilibrium theory revisited: 
Mutual gaze and personal space in virtual environ- 
ments. Presence: Teleoperators and Virtual En- 
vironments, 10(6), 583-598. 

Bellamy, A., & Hanewicz, C. (1999). Social psychological
dimensions of electronic communication.
Electronic Journal of Sociology, 4(1). Retrieved 
from http://www.sociology.org/content/vol004.001/ 
bellamy.html 

Boucouvalas, A. C., Xu, Z., & John, D. (2003).
Expressive image generator for an emotion extraction
engine. Proceedings of the 17th Annual
Human-Computer Interaction Conference, Bath
University, UK.

Cato, J. (2001). User-centred Web design. London: 
Addison-Wesley. 

Dix, A., Finlay, J., Abowd, G., & Beale, R. (1998). 
Human computer interaction (2nd ed.). London:
Prentice Hall. 






Loomis, J. M., Golledge, R. G., & Klatzky, R. L. 
(1998). Navigation system for the blind: Auditory 
display modes and guidance. Presence: 
Teleoperators and Virtual Environments, 7(2), 
192-203. 

McCracken, D. D., & Wolfe, R. J. (2004). User-centred
Website development: A human-computer
interaction approach. Upper Saddle River, NJ: 
Pearson. 

Merriam-Webster online. (1996). Retrieved July 7, 
2003, from http://www.m-w.com/home.htm 

Nielsen, J. (2000). Designing Web usability.
Indianapolis, IN: New Riders.

Nielsen, J., & Tahir, M. (2002). Homepage usabil- 
ity: 50 Websites deconstructed. Indianapolis, IN: 
New Riders. 

Nye, J. L., & Brower, A. M. (1996). What’s social 
about social cognition? London: Sage Publica- 
tions. 

Preece, J., Rogers, Y., & Sharp, H. (2002). Interac- 
tion design: Beyond human-computer interac- 
tion. New York: John Wiley & Sons. 

Short, J., Williams, E., & Christie, B. (1976). The 
social psychology of telecommunications. Lon- 
don: John Wiley & Sons. 

Witmer, B. G., & Singer, M. J. (1998). Measuring 
presence in virtual environments: A presence
questionnaire. Presence: Teleoperators and Virtual
Environments, 7(3), 225-240.

Xu, Z., & Boucouvalas, A. C. (2002). Text-to-emotion
engine for real time Internet communication.
International Symposium on Communication
Systems, Networks and DSPs, 164-168.

Xu, Z., John, D., & Boucouvalas, A. C. (2003). 
Emotion extraction engine: Expressive image gen- 
erator. Proceedings of EUROMEDIA, Plymouth, 
UK. 

Xu, Z., John, D., & Boucouvalas, A. C. (in press). 
Emotion analyzer. Emotion Journal. 

KEY TERMS 

2-D Interface: An interface in which text and
lines appear to be on the same flat level.

2.5-D Interface: An interface that applies vari- 
ous graphical algorithms to simulate the sense of 
depth on a 2-D interface. 

3-D Interface: An interface in which text and
images are not all on the same flat level. 

Emotion-Extraction Engine: A software sys- 
tem that can extract emotions embedded in textual 
messages. 

Sociability: The relative tendency or disposition 
to be sociable or associate with one’s fellows. 

Social Presence: The extent to which a person 
is perceived as a real person in computer-mediated 
communication. 

Task Analysis: A method of providing an ex- 
traction of the tasks users undertake when interact- 
ing with a system. 
INTRODUCTION
System Levels 

Computer systems have long been seen as more 
than just mechanical systems (Boulding, 1956). They 
seem to be systems in a general sense (Churchman, 
1979), with system elements, like a boundary, com- 
mon to other systems (Whitworth & Zaic, 2003). A 
computer system of chips and circuits is also a 
software system of information exchanges. Today, 
the system is also the human-computer combination 
(Alter, 1999); for example, a plane is mechanical, its 
computer controls are informational, but the plane 
plus pilot is also a system: a human-computer sys- 
tem. Human-computer interaction (HCI) sees com- 
puters as more than just technology (hardware and 
software). Computing has reinvented itself each 
decade or so, from hardware in the 1950s and 1960s, 
to commercial information processors in the 1970s, 
to personal computers in the 1980s, to computers as 
communication tools in the 1990s. At each stage, 
system performance increased. This decade seems 
to be that of social computing, in which software 
serves not just people but society, and systems like 
e-mail, chat rooms, and bulletin boards have a social 
level. Human-factors research has expanded from 
computer usability (individual), to computer-medi- 
ated communication (largely dyads), to virtual com- 
munities (social groups). The infrastructure is tech- 
nology, but the overall system is personal and social, 
with all that implies. Do social systems mediated by 
technology differ from those mediated by the natural 
world? The means of interaction, a computer
network, is virtual, but the people involved are real. One
can be as upset by an e-mail as by a letter. Online 
and physical communities have a different architec- 
tural base, but the social level is still people commu- 
nicating with people. This suggests computer-medi- 
ated communities operate by the same principles as 
physical communities; that is, virtual society is still a 
society, and friendships cross seamlessly from face- 
to-face to e-mail interaction. 

Table 1 suggests four computer system levels, 
matching the idea of an information system as 
hardware, software, people, and business processes 
(Alter, 2001). Social-technical systems arise when 
cognitive and social interaction is mediated by infor- 
mation technology rather than the natural world. 

BACKGROUND 

The Social-Technical Gap 

The levels of Table 1 are not different systems, but 
overlapping views of the same system. Higher levels 
depend on lower levels, so lower level failure implies 
failure at all levels above it; for example, if the 
hardware fails, the software does too as does the 
user interface. Higher levels are more efficient 
ways of operating the system as well as observing it. 
For example, social systems can generate enormous 
productivity. For this to occur, system design must 
recognize higher system-level needs. For example, 
usability drops when software design contradicts 
users’ cognitive needs. 
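
The dependency between levels can be made concrete with a small sketch. It orders the four levels of Table 1 from lowest to highest and returns every level that fails when a given level fails; the list LEVELS and the function name failed_levels are illustrative names, not part of the source.

```python
# Sketch: a failure at one system level propagates to every level above it.
# Level names follow Table 1, ordered from lowest to highest.
LEVELS = ["mechanical", "information", "cognitive", "social"]


def failed_levels(failed_at):
    """Return the failing level together with all levels that depend on it."""
    index = LEVELS.index(failed_at)  # raises ValueError for unknown levels
    return LEVELS[index:]
```

For instance, failed_levels("mechanical") returns all four levels, matching the remark that a hardware failure takes the software and the user interface down with it, whereas a failure at the social level leaves the lower levels running.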



Table 1. Information system levels

Level          Examples                                                   Discipline
Social         Norms, culture, laws, Zeitgeist, sanctions, roles          Sociology
Cognitive      Semantics, attitudes, beliefs, opinions, ideas, morals     Psychology
Information    Software programs, data, bandwidth, memory, processing     Computing
Mechanical     Hardware, computer, telephone, fax, physical space         Engineering

In physical society, architecture normally fits
social norms; for example, you may not legally enter
my house, and I can physically lock you out. In 
cyberspace, the architecture of interaction is the 
computer code that “makes cyberspace as it is” 
(Lessig, 2000). If this architecture ignores social 
requirements, there is a social-technical gap be- 
tween what computers do and what society wants 
(Figure 1). This seems a major problem facing social 
software today (Ackerman, 2000). Value-centered 
computing counters this gap by making software 
more social (Preece, 2000). 

Antisocial Interaction 

Social evolution involves specialization and coopera- 
tion on a larger and larger scale (Diamond, 1998). 
Villages became towns, then cities and metropolitan 
centers. The roving bands of 40,000 years ago 
formed tribes, chiefdoms, nation states, and 
megastates like Europe and the United States. Driv- 
ing this evolution are the larger synergies that larger 
societies allow. The Internet offers the largest soci- 
ety of all — global humanity — and potentially enor- 
mous synergies. To realize this social potential, 
software designers may need to recognize how 
societies generate nonzero-sum gains (Wright,
2001). While nonzero sum is an unpleasant term,
Wright’s argument that increasing the shared social 
pie is the key to social prosperity is strong. The logic 
that society can benefit everyone seems simple, yet 
communities have taken thousands of years to sta- 
bilize nonzero-sum benefits. Obviously, there is some 
resistance to social synergy. 

If social interactions are classified by the ex- 
pected outcome for the self and others (Table 2), 
situations where individuals gain at others’ expense 
are antisocial. Most illegal acts, like stealing, fall into 
this category. The equilibrium of antisocial interac-
tion is that all parties defect and nonzero-sum gains
are lost. Antisocial acts destabilize the nonzero-sum
gains of society, so to prosper, society must reduce 
antisocial acts. This applies equally to online society. 
Users see an Internet filled with pop-up ads, spam, 
pornography, viruses, phishing, spoofs, spyware, 
browser hijacks, scams, and identity theft. These 
can be forgiven by seeing the Internet as an uncivi- 
lized place, a stone-age culture built on space-age 
technology, inhabited by the “hunter-gatherers of 
the information age” (Meyrowitz, 1985, p. 315). This 
is the “dark side” of the Internet, a worldwide 
“tangled web” for the unwary (Power, 2000), a 
superhighway of misinformation, a social dystopia 
beyond laws where antisocial acts reign. 



Figure 1. Social-technical gap 
Users are naturally wary of such a society; that is,
they do not trust it. Trust has been defined as 
expecting that another’s action will be beneficial 
rather than detrimental (Creed & Miles, 1996). An- 
tisocial acts, by definition, do not create trust. Lack 
of trust reduces interaction, especially if there is a 
less risky alternative. For example, while electronic 
commerce is a billion-dollar industry, it has consis- 
tently performed below expectations, though in online 
trade sellers reduce costs and buyers gain choice at 
a lower price. E-commerce benefits both customers 
and companies, so why is it not the majority of trade? 
Every day millions of customers who want to buy 
things browse thousands of Web sites for products 
and services, yet the majority purchase from brick- 
and-mortar, not online, sources (Salam, Rao, & Pegels, 
2003). If online society does not prevent antisocial 
acts, users will not trust it, and if they do not trust it, 
they will use it less. 

In the tragedy of the commons, acts that benefit 
individuals harm the social group, whose loss affects 
the individuals in it (Poundstone, 1992). If farmers 
graze a common grass area, a valuable common 
resource is destroyed (from overgrazing), yet if one 
farmer does not graze, another will. The tragedy 
occurs if individual economics drives the group to 
destroy a useful common resource. Most animal 
species are barely able to cross this individual-gain 
barrier to social synergy. Only insect colonies com- 
pare to humans in size, but each community is one 
genetic family, allowing selection for cooperative 
behavior (Ridley, 1996). Humanity has created social 
benefits without genetic selection. How did we cross 
the zero-sum barrier? The answer seems to be our 
ability to develop social systems. 

If the commons farmers form a village, it makes 
no sense for the village to destroy its own resource. 
If the village social system, of norms, rules, and 
sanctions, can stop individuals from overgrazing, the 
village keeps its commons and the benefits thereof. If 
only the village chief grazes the commons, there is an 
inherent instability between individual and commu- 
nity gain. However, if the commons is shared, say by 
a grazing roster, both village and members benefit. 
As society has evolved, bigger communities have 
produced more but also shared more. Social systems 
that spread social benefits fairly seem to stabilize 
nonzero-sum benefits better than those in which 
society’s benefits accrue only to a few. The social
concept of fairness seems to reconcile the conflict
between private benefit and public good.
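
The commons argument can be made concrete with a toy calculation. The model below is an assumption made for illustration, not taken from the cited sources: each animal's value is assumed to fall linearly to zero as total grazing approaches the capacity of the commons, and the herd sizes are invented.

```python
def payoffs(herds, capacity=100):
    """Payoff per farmer: own herd size times the value of each animal.

    Assumed toy model: each animal's value falls linearly to zero as
    total grazing approaches the capacity of the commons.
    """
    total = sum(herds)
    value_per_animal = max(0.0, 1.0 - total / capacity)
    return [h * value_per_animal for h in herds]


farmers = 10
overgrazed = payoffs([10] * farmers)              # no restraint: the commons is destroyed
chief_only = payoffs([50] + [0] * (farmers - 1))  # one member takes the whole benefit
roster = payoffs([5] * farmers)                   # a roster shares the same total grazing

print(sum(overgrazed), sum(chief_only), sum(roster))
```

Under these assumptions the chief-only and roster cases produce the same total benefit, but only the roster spreads it across all members, which is the distribution the text argues is more stable.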






LEGITIMATE INTERACTION: A SOCIAL REQUIREMENT

Social systems of law and justice are primarily
about reducing unfairness in society (Rawls, 2001).
This is necessary because in society, one person’s
failure can cause another’s loss, and one person’s
contribution can become another’s gain, as in
software piracy. One way to reduce antisocial acts
is to make people accountable for the effects of 
their acts not just on themselves but also on others. 
Without such accountability, perceptions of unfair- 
ness arise, for example, when people take benefits 
others earned, or pay no price for harming others. 
Unfairness is not just the unequal distribution of 
outcomes, but the failure to distribute outcomes 
according to action contributions. Studies suggest 
people react strongly to unfairness, tend to avoid 
unfair situations (Adams, 1965), and even prefer 
fairness to personal benefit (Lind & Tyler, 1988). 
This natural justice perception seems to underlie 
our ability to form positive societies. Progress in 
legitimate rights seems to correlate with social 
wealth, as does social corruption with community 
poverty (Eigen, 2003). Perhaps people in fair soci- 
eties contribute more work, ideas, and research 
because others do not steal it, or self-regulate more, 
which reduces security costs. Either way, account- 
ability (or justice) seems a requirement for social 
prosperity. 

The social goal has been defined as legitimate 
interaction that is fair to individuals and beneficial to 
the social group (Whitworth & deMoor, 2003). 
Legitimacy is a complex social concept. Fairness 
alone does not define it as conflict can also be fair. 
A duel is a fair fight, but duels are still outlawed as 
being against society. Legitimate interaction in- 
cludes public-good benefits as well as individual 
fairness. In sociology, the term legitimate applies to 
governments that are justified to their people, not 
coerced (Barker, 1990). It can mean having the 
sanction of law, but legitimacy is more than legality. 
Mill (1859/1995, p. 1) talks of the “limits of power 
that can be legitimately exercised by society over 
the individual.” Jefferson wrote, "... the mass of 
mankind has not been born with saddles on their
backs, nor a favored few booted and spurred, ready
to ride them legitimately ..." (Somerville & Santoni,
1963, p. 246). Fukuyama (1992) argues that legiti-
mate communities prosper, while those that ignore
legitimacy do so at their peril. These statements have no
meaning if legitimacy and legality are the same, as 
then no law-setting government could act illegiti- 
mately. 

The social requirement of legitimacy comple- 
ments that of security. Security ensures a system is 
used as intended, while legitimacy defines that in- 
tent. Whether a user is who he or she says (authen- 
tication) is a security issue. What rights he or she 
should have (authority) is a legitimacy issue. In 
generating trust and business, no amount of security 
can compensate for a lack of legitimacy. Dictator- 
ships have powerful security forces, but their citi- 
zens distrust them, reducing social synergy. In pros- 
perous modern societies, security is directed by 
legitimacy, and legitimacy depends on security. 
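
The split between the two questions can be sketched as two separate checks. This is a minimal illustration only: the user name, password, rights table, and function names are all hypothetical, and a real system would never store or compare plain-text passwords.

```python
# Hypothetical in-memory data, for illustration only.
CREDENTIALS = {"alice": "s3cret"}       # who a user is (a security matter)
RIGHTS = {"alice": {"read_own_mail"}}   # what a user may do (a legitimacy matter)


def authenticate(user: str, password: str) -> bool:
    """Security question: is the user who he or she claims to be?"""
    return CREDENTIALS.get(user) == password


def is_authorized(user: str, action: str) -> bool:
    """Legitimacy question: should this user have the right to act?"""
    return action in RIGHTS.get(user, set())


def perform(user: str, password: str, action: str) -> str:
    """Allow an action only if both checks pass."""
    if not authenticate(user, password):
        return "rejected: not authenticated"
    if not is_authorized(user, action):
        return "rejected: no right to act"
    return f"{user} may {action}"


print(perform("alice", "s3cret", "read_own_mail"))
print(perform("alice", "s3cret", "read_bobs_mail"))
```

Strengthening `authenticate` makes the system more secure, but it says nothing about whether the contents of `RIGHTS` are fair; that table is where legitimacy would have to be specified.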

Online Legitimacy 

Physical society uses various means to prevent 
antisocial acts from destabilizing social benefits, 
including the following. 

1. Ethics: Supports right acts by religion or custom

2. Barriers: Fences, doors, or locks to prevent 
unfair acts 

3. Revenge: Individuals “pay back” those that 
cheat 

4. Norms: Community laws, sanctions, and police 

All have also been tried in cyberspace, with 
varying degrees of success. 

Arguably the best means to legitimate interaction 
is to have moral, ethical people, who choose not to 
cheat. But while most agree altruism is good and 
selfishness bad, we often do not practice what we 
preach (Ridley, 1996). Will online society make 
people more ethical than physical society? 

Barriers, like a locked door, can prevent unfair- 
ness, but any barrier raised can be overcome. Online 
security is a continual battle between those who 
create and those who cross barriers. Also, barriers 
can reduce as well as increase fairness. Do we
really want a cyber society built on the model of
medieval fortresses? 

A third way to legitimate interaction is through 
revenge: to repay actions in kind, or cheat the 
cheaters (Boyd, 1992). In Axelrod’s (1984) prisoner’s
dilemma tournament, the most successful program 
was TIT-FOR-TAT, which began cooperating, then 
copied whatever the other did. If people who are 
cheated today will take revenge tomorrow, cheating 
may not be worth it, but do we want cyber society 
run under a vigilante justice system? 
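
The TIT-FOR-TAT rule described above is simple enough to sketch in a few lines. The payoff values and the `always_defect` opponent are assumptions chosen for illustration (the payoffs are commonly used prisoner's dilemma values); this is not a reconstruction of the tournament itself.

```python
# (my move, other's move) -> my score; "C" = cooperate, "D" = defect.
# Commonly used prisoner's dilemma values, assumed here for illustration.
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """A purely antisocial strategy, used as a comparison."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the two total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees only the other's past moves
        b = strategy_b(moves_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_defect))
print(play(always_defect, always_defect))
```

With these payoffs, two TIT-FOR-TAT players cooperate throughout, while a defector gains only in the first round before being repaid in kind; two cooperators end up well ahead of two defectors.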

A fourth way for society to support legitimate 
interaction is by norms and laws. If laws oppose 
antisocial acts, why not apply laws online? This 
approach is popular, but old means may fail in new 
system environments (Whitworth & deMoor, 2003). 
Laws assume a physical-world architecture so may 
not easily transfer to virtual worlds that work differ- 
ently from the physical world (Burk, 2001). Legal 
processes may suffice for physical change, but 
while laws can take years to pass, the Internet can 
change in a month. New cases, like cookies, can 
arise faster than laws can be formed, like weeds 
growing faster than they are culled. Also, the pro- 
grammers who define cyberspace can bypass any 
law. The Internet, once thought innately ungovern- 
able, could easily become a system of perfect regu- 
lation and control (Lessig, 1999) as once software is 
written, issues of law may have already been de- 
cided. Finally, laws are limited by jurisdiction, as 
attempts to legislate telemarketers illustrate. U.S. 
law applies to U.S. soil, but cyberspace does not 
exist inside America. The many laws of many na- 
tions do not apply to a global Internet. For these 
reasons, the long arm of the law struggles to reach 
into cyberspace. The jury is still out, but many are
pessimistic. Traditional law seems too physical, too 
slow, too impotent, and too restricted for the chal- 
lenge of a global information society. 

FUTURE TRENDS 

That the social needs of online society are not yet 
met suggests two things. First, Internet growth may 
be just beginning, and second, meeting social needs 
is the way to achieve that growth. Perhaps we are 
only seeing the start of a major human social evolu-
tion. We may be no more able to envisage a global
information society than people in the Middle Ages
could conceive today’s global trade system. The 
differences are not just technical, like ships and 
airplanes, but also social, differences in how we 
interact. Traders today send millions of dollars to 
foreigners they have never seen for goods they have 
not touched to arrive at unknown times. Past traders 
would have seen that as mere folly, but today’s 
market economy has social as well as technical 
support: 

To participate in a market economy, to be willing 
to ship goods to distant destinations and to invest 
in projects that will come to fruition or pay 
dividends only in the future, requires confidence, 
the confidence that ownership is secure and 
payment dependable ...knowing that if the other 
reneges, the state will step in... (Mandelbaum, 
2002, p. 272)

Social benefits require the influence of social 
entities, like the state. Individual parties in an inter- 
action are biased to their own benefit. Only a 
community can embody legitimate rules above indi- 
viduals, yet these must be manifested as well as 
conceived. The concept of the state assumes physi- 
cal boundaries that do not exist in cyberspace. For 
online society to flourish, the gap between social 
right and software might must be closed, but stretch- 
ing physical law into cyberspace is problematic 
(Samuelson, 2003). Physical laws operate after the 
fact for practical reasons: To punish unfairness, it 
must first occur. Yet in cyberspace, we write the 
code that defines all interaction. It is as if we could 
write the laws of physics in the physical world. 
Hence, a new possibility arises. Why not focus on
the solution (legitimacy) rather than the problem 
(unfairness)? Why let antisocial acts like spam 
develop, then try ineffectually to punish them when 
we can design for social fairness in the first place? 
When societies move from punishing unfairness to 
encouraging legitimacy, it is a major advance, from 
the laws of Moses or Hammurabi to visionary
statements of social opportunity like the French 
Declaration of Human Rights or the United States 
Constitution. Cyberspace is a chance to apply sev-
eral thousand years of social learning to the global
electronic village; designing social software in a
social vacuum may condemn us to relearn the social
lessons of physical history in cyberspace. 

In physical society, it was the push for distributed 
ownership that created social rights; the original 
pursuers of rights were British elite seeking property 
rights from their King: “It was the protection of 
property that gave birth, historically, to political 
rights” (Mandelbaum, 2002, p. 271). Over time, the 
right to own was extended to all citizens, as giving 
today’s freedoms proved profitable. Ownership as a 
concept can be applied online. Twenty years ago, 
issues of “Who owns the material entered in a group 
communication space?” (Hiltz & Turoff, 1993, p. 
505) were raised. If information objects can be 
owned, a social property-rights framework can be 
applied to information systems (Rose, 2001). 
Analysing who owns what can translate social state- 
ments into IS specifications and vice versa (Whitworth 
& deMoor, 2003; Figure 2). 

Future social-software designers may face ques- 
tions of what should be done, not what can be done. 
There seems no reason why software should not 
support what society believes. If society believes 
people should be free, our Hotmail avatars should 
belong to us. If society gives a right not to commu- 
nicate (Warren & Brandeis, 1890), we should be 
able to refuse spam (Whitworth & Whitworth, 2004). 
If society supports privacy, we should be able to 
remove personal data from online lists. If society 
gives creators rights to the fruits of their labors 
(Locke, 1963), we should be able to sign and own 
electronic items. If society believes in democracy, 
online bulletin boards should be able to elect their 
leaders. Such suggestions do not mean the mechani- 
zation of online interaction: Social rights do not work 
that way. Society grants people privacy, but does not 
force them to be private. Likewise, owning a bulle- 
tin-board item means you may delete it, not that you 
must delete it. Software support for social rights 
would allocate rights to act, not automate right acts, 
giving choice to people, not to program code.
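
The difference between allocating a right and automating an act can be shown with a small sketch of the bulletin-board example. The class and function names are hypothetical, and the model assumes the simplest possible rule: only an item's owner holds the right to delete it.

```python
class BoardItem:
    """A bulletin-board item with an owner (hypothetical model)."""

    def __init__(self, owner: str, text: str):
        self.owner = owner
        self.text = text
        self.deleted = False


def may_delete(user: str, item: BoardItem) -> bool:
    """The right to delete is allocated to the owner."""
    return user == item.owner


def delete(user: str, item: BoardItem) -> None:
    """Deletion happens only when a user holding the right chooses to act."""
    if not may_delete(user, item):
        raise PermissionError(f"{user} does not own this item")
    item.deleted = True


item = BoardItem("ann", "Meeting moved to Friday")
print(may_delete("ann", item), item.deleted)  # the right exists, yet nothing is deleted
```

The software decides only who may act; whether the item is ever deleted remains the owner's choice.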
Figure 2. Social-requirements analysis

CONCLUSION

The core Internet architecture was designed over 30 
years ago to engineering requirements existing when 
a global electronic society was not even envisaged. 
It seems due for an overhaul to meet the social needs 
of virtual society. Architecture, whether physical or 
electronic, affects everything, and social systems 
require precisely such general changes. The mar- 
riage of society and technology needs respect on 
both sides. To close the social-technical gap, tech- 
nologists cannot stand on the sidelines: They must 
help. System designers must recognize accepted 
social concepts, like freedom, privacy, and democ- 
racy, that is, specify social requirements as they do 
technical ones. Translating social requirements into 
technical specifications is a daunting task, but the 
alternative is an antisocial cyber society that is not a 
nice place to be. If human society is to expand into 
cyberspace, with all the benefits that implies, tech- 
nology must support social requirements. The new 
user of social-technical software is society, and the 
user requirement of society is legitimate interaction. 
KEY TERMS

Avatar: An information object that represents a 
person in cyberspace, whether a Hotmail text ID or 
a graphical multimedia image in an online multiplayer 
game. 

Information System: A general system that 
may include hardware, software, people, and busi- 
ness or community structures and processes (Alter, 
1999, 2001), vs. a social-technical system, which 
must include all four levels. 

Nonzero Sum: In zero-sum interaction, one 
party gains at another’s expense so the parties 
compete. Negative acts that harm others but benefit 
the actor give an “equilibrium” point at which every- 
one defects and everyone loses (Poundstone, 1992). 
In contrast, in nonzero-sum interaction, parties co-
operate to increase the shared resource pie, so they
gain more than they could have working alone: It is 
a win-win situation. The synergistic benefits of 
society seem based on nonzero-sum gains (Wright, 
2001).

Social System: Physical society is not just me- 
chanics nor is it just information, as without people 
information has no meaning. Yet it is also more than 
people. Countries with people of similar nature and 
abilities, like North and South Korea, or East and 
West Germany, performed differently as societies. 
As people come and go, we say the society contin- 
ues. Jewish individuals of 2,000 years ago have died 
just as the Romans of that time, yet we say the Jews 
survived while the Romans did not. What survived 
was not buildings, information, or people, but a 
manner of interaction: their social system. A social 
system is a general form of human interaction that 
persists despite changes in individuals, communica- 
tions, or architecture (Whitworth & deMoor, 2003) 
based on persistent common cognitions regarding 
ethics, social structures, roles, and norms. 

System: A system must exist within a world and 
cannot exist if its world is undefined: No world 
means no system. Existence is a property a system 
derives from the world around it. The nature of a 
system is the nature of the world that contains it; for 
example, a physical world, a world of ideas, and a 
social world may contain physical systems, idea 
systems, and social systems, respectively. A system 
that exists still needs an identity to define what is a 
system and what is not a system. A system indistin- 
guishable from its world is not a system; for example, 
a crystal of sugar that dissolves in water still has 
existence as sugar, but is no longer a separate 
macroscopic system. The point separating system 
from nonsystem is the system boundary. Existence 
and identity seem two basic requirements of any 
system. 

System Elements: An advanced system has a 
boundary, an internal structure, environment effec- 
tors, and receptors (Whitworth & Zaic, 2003). Simple 
biological systems (cells) formed a cell-wall bound- 
ary and organelles for internal cell functions (Alberts 
et al., 1994). Simple cells like Giardia developed 
flagella to effect movement, and protozoa developed 
light-sensitive receptors. We ourselves, though more 
complex, still have a boundary (skin), an internal
structure of organs, muscle effectors, and sense
receptors. Computer systems have the same ele- 
ments: a physical-case boundary, an internal archi- 
tecture, printer and screen effectors, and keyboard 
and mouse receptors. These elements apply at dif- 
ferent levels; for example, software systems have 
memory boundaries, internal program structures, 
specialized input analysers, and specialized output 
driver units. 

System Environment: In a changing world, 
changes outside a system may cause changes within 
it, and changes within may cause changes without. 
A system’s environment is that part of a world that 
can change the system or be affected by it. What 
succeeds in the system-environment interaction de- 
pends on the environment. In Darwinian evolution, 
the environment defines system performance. Three 
things seem relevant: opportunities, threats, and the 
rates by which these change. In an opportunistic 
environment, right action can give great benefit. In 
a risky environment, wrong action can give great 
loss. In a dynamic environment, risk and opportunity 
change quickly, giving turbulence (sudden risk) or 
luck (sudden opportunity). An environment can be of 
any combination, for example, opportunistic, risky, 
and dynamic. 

System Levels: Is the physical world the only 
real world? Are physical systems the only possible 
systems? The term information system suggests 
otherwise. Philosophers propose idea systems in 
logical worlds. Sociologists propose social systems. 
Psychologists propose cognitive mental models. 
Software designers propose data entity relationship 
models quite apart from hardware. Software cannot 
exist without a hardware system of chips and cir- 
cuits, but the software world of data records and 
files is not equivalent to the hardware world. It is a 
different system level. Initially, computer problems 
were mainly hardware problems, like overheating. 
Solving these led to software problems, like infinite 
loops. Informational requirements began to drive 
chip development, for example, network and data- 
base protocol needs. HCI added cognitive require- 
ments to the mix. Usability demands are now part of 
engineering-requirements analysis (Sanders & 
McCormick, 1993) because Web sites fail if people 
reject them (Goodwin, 1987). Finally, a computer- 
mediated community can also be seen as a social 
system. An information system can be conceived on
four levels: mechanical, informational, cognitive, and 
social. Each emerges from the previous, not in some 
mystical way, but as a different framing of the same 
thing. For example, information derives from me- 
chanics, human cognitions from information, and 
society from a sum of human cognitions (Whitworth 
& Zaic, 2003). If all levels derive from hardware, 
why not just use that perspective? Describing mod- 
ern computers by chip and line events is possible but 
inefficient, like describing World War II in terms of 
atoms and electrons. As higher levels come into 
play, systems become more complex but also offer 
higher performance efficiencies. 



System Performance: A traditional information 
system’s performance is its functionality, but func- 
tions people cannot use do not add performance. If 
system performance is how successfully a system 
interacts with its environment, usability can join 
nonfunctional IS requirements, like security and 
reliability, as part of system performance. The four 
advanced system elements (boundary, internal struc- 
ture, effectors, and receptors) can maximize oppor- 
tunity or minimize risk in a system environment. A 
multidimensional approach to system performance, 
as suggested by Chung, Nixon, Yu, and Mylopoulos 
(1999), suggests eight general system goals appli- 
cable to modern software: functionality, usability, 
reliability, flexibility, security, extendibility, connec- 
tivity, and confidentiality (Whitworth & Zaic, 2003). 
INTRODUCTION

Socio-cognitive engineering is a framework for the 
systematic design of socio-technical systems (people 
and their interaction with technology), based on the 
study and analysis of how people think, learn, per- 
ceive, work, and interact. The framework has been 
applied to the design of a broad range of human-
centered technologies, including a Writer’s Assis-
tant (Sharples, Goodlet, & Pemberton, 1992), a
training system for neuroradiologists (Sharples et
al., 2000), and a mobile learning device for children
(Sharples, Corlett, & Westmancott, 2002). It has
been adopted by the European MOBIlearn project 
(www.mobilearn.org) to develop mobile technology 
for learning. It also has been taught to undergraduate 
and postgraduate students to guide their interactive 
systems projects. An overview of the framework 
can be found in Sharples et al. (2002).

BACKGROUND 

The approach of socio-cognitive engineering is simi- 
lar to user-centered design (Norman & Draper, 
1986) in that it builds on studies of potential users of 
the technology and involves them in the design 
process. But users are not always reliable infor- 
mants. They may idealize their methods, describing 
ways in which they would like to or have been told 
to work, rather than their actual practices. Although 
users may be able to describe their own styles and 
strategies of working, they may not be aware of how 
other people can perform a task differently and 
possibly more effectively. Surveys of user prefer- 
ences can result in new technology that is simply an 
accumulation of features rather than an integrated 
system. 

Thus, socio-cognitive engineering is critical of
the reliability of user reports. It extends beyond
individual users to form a composite picture of
human knowledge and activity, including cognitive
processes and social interactions, styles and strate-
gies of working, and language and patterns of com-
munication. The term actor is used rather than user
to indicate that the design may involve people who 
are stakeholders in the new technology but are not 
direct users of it. 

The framework extends previous work in soft 
systems (Checkland & Scholes, 1990), socio-techni- 
cal and cooperative design (Greenbaum & Kyng, 
1991; Mumford, 1995; Sachs, 1995), and the applica-
tion of ethnography to system design (see Rogers &
Bellotti [1997] for a review). It incorporates existing
methods of knowledge engineering, task analysis, 
and object-oriented design, but integrates them into 
a coherent methodology that places equal emphasis 
on software, task, knowledge, and organizational 
engineering. 

The framework also clearly distinguishes study- 
ing everyday activities using existing technology 
from studying how the activity changes with pro- 
posed technology. It emphasizes the dialectic be- 
tween people and artefacts; using artefacts changes 
people’s activities, which, in turn, leads to new needs
and opportunities for design. 

FRAMEWORK 

Figure 1 gives a picture of the flow and main 
products of the design process. It is in two main 
parts: a phase of activity analysis to interpret how 
people work and interact with their current tools and 
technologies, and a phase of systems design to build 
and implement new interactive technology. The 
bridge between the two is the relationship between 
the Task Model and the Design Concept. Each 
phase comprises stages of analysis and design that 
are implemented through specific methods. The 
framework does not prescribe which methods to 
use; the choice depends on the type and scale of the 
project. 
Figure 1. Overview of the flow and main products of the design process

It is important to note that the process is not a
simple sequence but involves a dialogue between the 
stages. Earlier decisions and outcomes may need to 
be revised in order to take account of later findings. 
When the system is deployed, it will enable and 
support new activities, requiring another cycle of 
analysis, revision of the Task Model, and further 
opportunities for design. 

The elements of socio-cognitive engineering are 
as follows: 

• Project: The diagram shows the process of 
design, implementation, and deployment for a 
single project. 

• Actors: Different types of people may be 
involved in or affected by the design and de- 
ployment, including (depending on the scale of 
the project) design, marketing and technical 
support teams, direct users of the system, and 
other people affected by it (e.g., administrative 
staff). 

• Roles: The actors take on roles (e.g., team 
leader), which may change during the project. 

• Stage: Each box represents one stage of the 
project. 

• Methods: Each stage can be carried out by 
one or more methods of analysis and design, 
which need to be specified before starting the 
stage. 

• Tools: Each method has associated tools (for 
activity analysis, software specification, sys- 
tems design, and evaluation) in order to carry 
out the method. 



• Outcomes: Each stage has outcomes that 
must be documented, and these are used to 
inform and validate the system design. 

• Measures: Each design decision must be 
validated by reference to outcomes from one 
of the stages. 
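
One way to picture how stages, methods, outcomes, and measures relate is a small data model. The sketch below is an assumption made for illustration, not part of the framework itself: the class names, field names, and example strings are hypothetical, and "validated" is reduced to the simplest reading of the Measures element, namely that a decision cites at least one documented outcome.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    name: str
    methods: list[str]  # specified before the stage starts
    outcomes: list[str] = field(default_factory=list)  # documented results


@dataclass
class DesignDecision:
    description: str
    supporting_outcomes: list[str]  # references to documented stage outcomes


def is_validated(decision: DesignDecision, stages: list[Stage]) -> bool:
    """A decision counts as validated if it cites a documented outcome."""
    documented = {outcome for stage in stages for outcome in stage.outcomes}
    return any(o in documented for o in decision.supporting_outcomes)


# Hypothetical example data.
stages = [Stage("Field Studies", ["observation"], ["learners annotate on paper"])]
decision = DesignDecision("support handwritten notes", ["learners annotate on paper"])
print(is_validated(decision, stages))
```

Keeping outcomes as explicit records makes it possible to check, at any point in the project, which design decisions still lack supporting evidence.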

The general sequence for socio-cognitive engi- 
neering is as follows: 

1. Form a project team.

2. Produce General Requirements for the project. 

3. Decide which methods and tools will be used 
for each stage of the project. 

4. Decide how the process and outcomes will be 
documented. 

5. Decide how the project will be evaluated. 

6. Carry out each stage of the project, ensuring 
that the requirements match the design. 

7. Carry out a continuous process of documenta- 
tion and evaluation. 

The process starts by specifying the General 
Requirements for the system to be designed. These 
provide broad yet precise initial requirements and 
constraints for the proposed system in language that 
designers and customers can understand. They are 
used to guide the design and to provide a reference 
for validation of the system. The requirements nor- 
mally should indicate: 

• The scope of the project; 

• The main actors involved in designing, deploy- 
ing, using, and maintaining the system; 
• The market need and business case; and

• General attributes and constraints of the pro-
posed system (e.g., whether it aims to support
individual or collaborative working).

The requirements will be extended and made 
more precise as the project progresses. 

This leads to two parallel studies: a theory-based 
study of the underlying cognitive processes and so- 
cial activities, and an investigation into how everyday 
activities are performed in their normal contexts. The 
Theory of Use involves an analysis of relevant 
literature from cognitive psychology, social sciences, 
and business management to form a rich picture of 
the human knowledge and activity. It is essential that 
this should offer a clear guide to system design. Thus, 
it must be relevant to the intended use of the system 
and extend the requirements in a form that can be 
interpreted by software designers and engineers. 

The aim of carrying out Field Studies is to 
uncover how people interact with current technology 
in their normal contexts. The role of the fieldworker 
is both to interpret activity and to assist technology 
design and organizational change. This addresses the 
widely recognized problem of ethnographic ap- 
proaches that, while they can provide an understand- 
ing of current work practices, are not intended to 
explore the consequences of socio-technical change. 

Table 1 shows a multi-level structure for field 
studies, with level 1 consisting of a survey of the 



existing organizational structures and schedules, 
levels 2 and 3 providing an analysis of situated 
practices and interactions of those for whom the 
technology is intended, and level 4 offering a syn- 
thesis of the findings in terms of designs for new 
socio-technical systems. The four levels give an 
overview of activity, leading to a more detailed 
investigation of particular problem areas, with each 
level illuminating the situated practices and also 
providing a set of issues to be addressed for the next 
level. These piece together into a composite picture 
of how people interact with technology in their 
everyday lives, the limitations of existing practices, 
and ways in which they could be improved by new 
technology. 

The outcomes of these two studies are synthe- 
sized into a Task Model. This is a synthesis of 
theory and practice related to how people perform 
relevant activities with their existing technologies. 
It is the least intuitive aspect of socio-cognitive 
engineering; it is tempting to reduce it to a set of 
bullet-point issues, yet it provides a foundation for 
the systems design. It could indicate: 

• The main actors and their activity systems; 

• How the actors employ tools and resources to 
mediate their interaction and to externalize 
cognition; 

• How the actors represent knowledge to them- 
selves and to others; 



Table 1. Multi-level structure for field studies 

Level 1: Activity structures and schedules 
Activity: Study work plans, organizational structures, syllabuses, resources. 
Purpose: To discover how the activities are supposed to be conducted. 
Outcome: Description of the existing organizational and workplace structures; 
identification of significant events. 

Level 2: Significant events 
Activity: Observe representative formal and informal meetings and forms of 
communication. 
Purpose: To discover how activities, communication, and social interaction are 
conducted in practice. 
Outcome: A description and analysis of events that might be important to system 
design; identification of mismatches between how activity has been scheduled and 
how it has been observed to happen. 

Level 3: Conceptions and conflicts 
Activity: Conduct interviews with participants to discuss areas of activity needing 
support, breakdowns, issues, differences in conception. 
Purpose: To determine people’s differing conceptions of their activity; uncover 
issues of concern in relation to new technology; explore mismatches between what 
is perceived to happen and what has been observed. 
Outcome: Issues in everyday life and interactions with existing technology that 
could be addressed by new technology and working practices. 

Level 4: Determining designs 
Activity: Elicitation of requirements; design space mapping; formative evaluation 
of prototypes. 
Purpose: To develop new system designs. 
Outcome: Prototype technologies and recommendations for deployment. 

• The methods and techniques that the actors 
employ, including differences in approach and 
strategy; 

• The contexts in which the activities occur; 

• The implicit conventions and constraints that 
influence the activity; and 

• The actors’ conceptions of their work, includ- 
ing sources of difficulty and breakdown in 
activity and their attitudes toward the introduc- 
tion of new technology. 

The Design Concept needs to be developed in 
relation to the Task Model. It should indicate how the 
activities identified by the Task Model could be 
transformed or enhanced with the new technology. 
It should: 

• Indicate how limitations from the Task Model 
will be addressed by new technology; 

• Outline a system image (Norman, 1986) for the 
new technology; 

• Show the look and feel of the proposed technol- 
ogy; 

• Indicate the contexts of use of the enhanced 
activity and technology; and 

• Propose any further requirements that have 
been produced as a result of constructing the 
Design Concept. 

The Design Concept should result in a set of 
detailed design requirements and options that can be 
explored through the design space. 



The relationship between the Task Model and 
Design Concept provides the bridge to a cycle of 
iterative design that includes: 
• Generating a space of possible system designs, 
systematically exploring design options, and 
justifying design decisions; 

• Specifying the functional and non-functional 
aspects of the system; 

• Implementing the system; and 

• Deploying and maintaining the system. 

Although these stages are based on a conven- 
tional process of interactive systems design (see 
Preece, Rogers, & Sharp [2002] for an overview), 
they give equal emphasis to cognitive and organiza- 
tional factors as well as to task and software speci- 
fications. The stages shown in Figure 1 are an aid to 
project planning but are not sufficiently detailed to 
show all the design activities. Nor does the figure 
make clear that to construct a successful integrated 
system requires the designers to integrate software 
engineering with design for human cognition, social 
interaction, and organizational management. The 
“building-block” diagram in Table 2 gives a more 
detailed picture of the system’s design process. 

The four “pillars” indicate the main processes of 
software, task, knowledge, and organizational engi- 
neering. Each “brick” in the diagram shows one 
outcome of a design stage, but it is not necessary to 
build systematically from the bottom up. A design 
team may work on one pillar (e.g., knowledge 
engineering) up to the stage of system requirements, or 
it may develop an early prototype based on a 
detailed task analysis but without a systematic 
approach to software engineering. How each activity 
is carried out depends on the particular application 
domain, actors, and contexts of use. 

Table 2. A building-block framework for socio-cognitive system design 

Stage | Software Engineering | Task Engineering | Knowledge Engineering | Organizational Engineering 
Maintain | Installed system | New task structure | Augmented knowledge | New organizational structure 
Evaluate | Debugging | Usability | Conceptual change, skill development | Organizational change 
Integrate | Prototype System (single block across all four pillars) 
Implement | Prototypes, Documentation | Interfaces, Cognitive Tools | Knowledge Representation | Communications, Network Resources 
Design | Algorithms and Heuristics | Human-Computer Interaction | Domain Map, User Model | Socio-Technical System 
Interpret | Task Model (single block across all four pillars) 
Analyze | Requirements | Tasks: Goals, Objects, Methods | Knowledge: Concepts, Skills | Workplace: Practices, Interactions 
Survey | Existing Systems | Conventional Task Structures and Processes | Domain Knowledge | Organizational Structures and Schedules 
Propose | General Requirements (single block across all four pillars) 

The design activities are modular, allowing the 
designer to select one or more methods of conduct- 
ing the activity according to the problem and domain. 
For example, the usability evaluation could include 
an appropriate selection of general methods for 
assessing usability, or it could include an evaluation 
designed for the particular domain. 

It should be emphasized that the blocks are not 
fixed entities. As each level of the system is devel- 
oped and deployed, it will affect the levels that follow 
(e.g., building a prototype system may lead to revis- 
ing the documentation or re-evaluating the human- 
computer interaction; deploying the system will cre- 
ate new activities). These changes need to be ana- 
lyzed and supported through a combination of new 
technology and new work practices. Thus, the build- 
ing blocks must be revisited both individually to 
analyze and update the technology in use, and through 
a larger process of iterative redesign. 
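
The idea that the blocks are not fixed, and that work on one block can force a review of others, can be illustrated with a small sketch. This is only one possible reading, not part of the socio-cognitive engineering method itself: the class name BuildingBlocks, the method record, and the rule that every other completed block is flagged for review are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Pillar and stage names follow the building-block framework in the text;
# the class and method names are hypothetical, chosen for this sketch only.
PILLARS = ("software", "task", "knowledge", "organizational")
STAGES = ("propose", "survey", "analyze", "interpret", "design",
          "implement", "integrate", "evaluate", "maintain")


@dataclass
class BuildingBlocks:
    # Maps (stage, pillar) to the outcome recorded for that block.
    outcomes: dict = field(default_factory=dict)

    def record(self, stage, pillar, outcome):
        """Record or revise one block; return the other blocks to revisit."""
        if stage not in STAGES or pillar not in PILLARS:
            raise ValueError(f"unknown block: {stage}/{pillar}")
        self.outcomes[(stage, pillar)] = outcome
        # Assumption: any change may affect every other block already in
        # place, so all of them are flagged for review.
        return sorted(key for key in self.outcomes if key != (stage, pillar))


blocks = BuildingBlocks()
# Blocks may be filled in any order, not strictly from the bottom up.
blocks.record("analyze", "knowledge", "Concepts, skills")
blocks.record("design", "task", "Human-computer interaction")
to_revisit = blocks.record("implement", "software", "Prototypes, documentation")
print(to_revisit)
```

In this sketch, recording the software prototype flags the earlier knowledge and task blocks for review, which mirrors the example of a prototype prompting a re-evaluation of the human-computer interaction.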

Although Table 2 shows system evaluation as a 
distinct phase, there also will be a continual process 
of testing to verify and validate the design, as shown 
in Figure 1. Testing is an integral part of the entire 
design process, and it is important to see it as a 
lifecycle process (Meek & Sharples, 2001) with the 
results of testing early designs and prototypes being 
passed forward to provide an understanding of how 
to deploy and implement the system, and the 
outcomes of user trials being fed back to assist in fixing 
bugs and improving the design choices. 

The result of the socio-cognitive engineering 
process is a new socio-technical system consisting 
of new technology, its associated documentation, 
and proposed methods of use. When this is deployed 
in the workplace, home, or other location, it not only 
is likely to reveal bugs and limitations that need to be 
addressed but also will engender new patterns of work, 
social, and organizational structures that become 
contexts for further analysis and design. 



FUTURE TRENDS 

The computer and communications industries are 
starting to recognize the importance of adopting a 
human-centered approach to the design of new 
socio-technical systems. They are merging their 
existing engineering, business, industrial design, and 
marketing methods into an integrated process, un- 
derpinned by rigorous techniques to capture require- 
ments, define goals, predict costs, plan activities, 
specify designs, and evaluate outcomes. IBM, for 
example, has developed the method of User Engi- 
neering to design for the total user experience (IBM, 
2004). As Web-based technology becomes embed- 
ded into everyday life, it increasingly will be impor- 
tant to understand and design distributed systems for 
which there are no clear boundaries between people 
and technology. 

CONCLUSION 

Socio-cognitive engineering forms part of an historic 
progression from user-centered design and soft 
systems analysis toward a comprehensive and rigor- 
ous process of socio-technical systems design and 
evaluation. It has been applied through a broad range 
of projects for innovative human technology and is 
still being developed, most recently as part of the 
European MOBIlearn project. 
KEY TERMS 

Activity System: The assembly and interaction 
of people and artefacts considered as a holistic 
system that performs purposeful activities. See 
http://www.edu.helsinki.fi/activity/pages/chatanddwr/activitysystem/ 

Human-Centred Design: The process of de- 
signing socio-technical systems (people in interac- 
tion with technology) based on an analysis of how 
people think, learn, perceive, work, and interact. 

Socio-Technical System: A system comprising 
people and their interactions with technology (e.g., 
the World Wide Web). 

Soft Systems Methodology: An approach de- 
veloped by Peter Checkland to analyze complex 
problem situations containing social, organizational, 
and political activities. 

System Image: A term coined by Don Norman 
(1986) to describe the guiding metaphor or model of 
the system that a designer presents to users (e.g., 
the desktop metaphor or the telephone as a “speak- 
ing tube”). The designer should aim to create a 
system image that is consistent and familiar (where 
possible) and enables the user to make productive 
analogies. 

Task Analysis: An analysis of the actions and/or 
knowledge and thinking that a user performs to 
achieve a task. See 
http://www.usabilitynet.org/tools/taskanalysis.htm 

User-Centered Design: A well-established 
process of designing technology that meets users’ 
expectations or that involves potential users in the 
design process. 

User Engineering: A phrase used by IBM to 
describe an integrated process of developing prod- 
ucts that satisfy and delight users. 
INTRODUCTION 

Technology Affecting CBISs 

As computer technology continues to leapfrog for- 
ward, CBISs are changing rapidly. These changes 
are having an enormous impact on the capabilities of 
organizational systems (Turban, Rainer, & Potter, 
2001). The major ICT developments affecting CBISs 
can be categorized in three groupings: hardware- 
related, software-related, and hybrid cooperative 
environments. 

Hardware-Related 

Hardware consists of everything in the “physical 
layer” of the CBISs. For example, hardware can 
include servers, workstations, networks, telecom- 
munication equipment, fiber-optic cables, handheld 
computers, scanners, digital capture devices, and 
other technology-based infrastructure (Shelly, 
Cashman, & Rosenblatt, 2003). Hardware-related 
developments relate to the ongoing advances in the 
hardware aspects of CBISs. 

Software-Related 

Software refers to the programs that control the 
hardware and produce the desired information or 
results (Shelly et al., 2003). Software-related devel- 
opments in CBIS are related to the ongoing advances 
in the software aspects of computing technology. 

Hybrid Cooperative Environments 

Hybrid cooperative environment developments are 
related to the ongoing advances in both the hardware 
and software aspects of computing technology. Some 
of these technologies create new opportunities on the 
Web (e.g., multimedia and virtual reality), while others 
fulfill specific needs on the Web (e.g., electronic 
commerce (EC) and integrated home computing). 

These ICT developments are important components 
to be considered in the development of CBISs. 
As new types of technology are developed, new 
standards are set for future development. The ad- 
vent of hand-held computer devices is one such 
example. 

BACKGROUND 

A Software Engineering View 

In an effort to increase the success rate of informa- 
tion systems implementation, the field of software 
engineering (SE) has developed many techniques. 
Despite many software success stories, a consider- 
able amount of software is still being delivered late, 
over budget, and with residual faults (Schach, 2002). 

The field of SE is concerned with the develop- 
ment of software systems using sound engineering 
principles for both technical and non-technical 
aspects. Over and above the use of specification, 
design, and implementation techniques, human 
factors and software management should also be 
addressed. Well-engineered software provides the 
service required by its users. Such software should 
be produced in a cost-effective way and should be 
appropriately functional, maintainable, reliable, effi- 
cient, and provide a relevant user interface (Press- 
man, 2000a; Shneiderman, 1992; Whitten, Bentley, 
& Dittman, 2001). 

There are two major development methodologies 
that are used to develop IS applications: the tradi- 
tional systems development methodology and the 
object-oriented (OO) development approach. 
The traditional systems approaches have the 
following phases: 

• Planning: this involves identifying business 
value, analysing feasibility, developing a work 
plan, staffing the project, and controlling and 
directing the project. 

• Analysis: this involves information gathering 
(requirements gathering), process modeling, 
and data modeling. 

• Design: this step is comprised of physical 
design, architecture design, interface design, 
database and file design, and program design. 

• Implementation: this step requires both con- 
struction and installation. 

There are various OO methodologies. Although 
diverse in approach, most OO development 
methodologies follow a defined system development life 
cycle. The various phases are intrinsically 
equivalent for all of the approaches, typically proceeding as 
follows: 

• OO analysis phase (determining what the 
product is going to do and extracting the 
objects; requirements gathering), OO design 
phase, OO programming phase (implemented 
in an appropriate OO programming language), 
integration phase, maintenance phase, and 
retirement (Schach, 2002). 

One phase of the SE life cycle that is common to 
both the traditional development approach and the 
OO approach is requirements gathering. 
Requirements gathering is the process of eliciting the 
overall requirements of the product from the 
customer (user). These requirements encompass 
information and control needs, product function and 
behavior, overall product performance, design and 
interface constraints, and other special needs. The 
requirements-gathering phase has the following 
process: requirements elicitation; requirements analysis 
and negotiation; requirements specification; system 
modeling; requirements validation; and requirements 
management (Pressman, 2000a). 

Despite the concerted efforts to develop a suc- 
cessful process for developing software, Schach 
(2002) identifies the following pitfalls: 



• Traditional engineering techniques cannot be 
successfully applied to software development, 
causing the software depression (software 
crisis). Mullet (1999) summarizes the software 
crisis by noting that software development is 
seen as a craft rather than an engineering 
discipline. The approach to education taken by 
most higher education institutions encourages 
that “craft” mentality. 

• Lack of professionalism within the SE world 
(e.g., the failure to treat an operating system’s 
crash as seriously as a civil engineer would 
treat the collapse of a bridge). 

• The high acceptance of fault tolerance by 
software engineers (e.g., if the operating 
system crashes, reboot, hopefully with minimal 
damage). 

• The mismatch between hardware and 
software developments. Hardware and software 
developments are both taking place at a 
rapid pace but independently of each other. 
Both need time to mature before they are 
compatible with each other, but by that time 
everything has changed. 

• The constant shifting of the goalposts. 
Customers initially think they want one thing 
but frequently change their requirements. 
Notwithstanding these pitfalls, Pressman (2000b) 
argues that SE principles always work. It is never 
inappropriate to stress the principles of solid problem 
solving, good design, thorough testing, control of 
change, and emphasis on quality. 

The Web is an intricate and complex combination 
of technologies (both hardware and software) that 
are at different levels of maturity. Engineering Web- 
based EC software, therefore, has its own unique 
challenges. In essence, the network becomes a 
massive computer that provides an almost unlimited 
software resource that can be accessed by anyone 
with a modem (Pressman, 2000a). We illustrate 
these intricacies in Figure 1, which is a representa- 
tion of a home computer that is attached to the 
Internet. It depicts the underlying operating system 
(the base platform), the method of connection to the 
Internet (dial up, the technology that supports Web 
activities), browser, an example of a Web communi- 
cation language (HTML), and additional technology 
that may be required to be Web active. 
Figure 1. EC Web application platform (adapted from Hurst & Gellady, 2000) 

Add-ons: cookies, plug-ins, Java, JavaScript (all of which have their own error 
messages, liable to appear at any time), the familiar DNS error, ad banners, etc. 

HTML: links, graphics, frames, forms, tables, etc. 

Browser: bookmarks, history file, Back/Forward/Home/Reload buttons, etc. 

TCP/IP: running dial-up network to call the ISP, retrying if necessary, etc. 

Operating System: opening and closing of windows, menus, dialog boxes, clicking 
“Start” to shut down the computer, etc. 



All the aspects of Figure 1 will support EC soft- 
ware in some way or another. An SE defect in any of 
these five layers would create a problem. For ex- 
ample, if the operating system is poorly engineered, 
the technology that sits on this platform will give 
piecemeal functionality at best. The problem is fur- 
ther complicated by piecemeal “patch” solutions. 
These piecemeal solutions can severely affect the 
usability of the Web, for example by giving cryptic 
error messages, installing add-ons that affect some 
unknown setting that the users do not understand, or 
installing add-ons that require a particular bit of 
hardware or software to be present. 
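
The way a defect in a lower layer undermines everything built on top of it can be sketched in a few lines. The layer names come from Figure 1; the function name affected_layers and the assumption that each layer depends on every layer beneath it are illustrative simplifications, not claims about how the platform is actually implemented.

```python
# Layers of the platform, listed from the bottom of the stack to the top.
# Assumption for this sketch: each layer depends on every layer beneath it.
LAYERS = ["Operating System", "TCP/IP", "Browser", "HTML", "Add-ons"]


def affected_layers(defective_layer):
    """Return the defective layer and every layer stacked above it."""
    position = LAYERS.index(defective_layer)  # raises ValueError if unknown
    return LAYERS[position:]


print(affected_layers("Operating System"))
print(affected_layers("HTML"))
```

Under this simplification a poorly engineered operating system touches all five layers, whereas a defect in the HTML layer reaches only HTML and the add-ons that sit on it.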

The View of HCI Advocates 

Human-computer interaction is concerned with the 
way in which computers can be used to support 
human beings engaged in particular activities. HCI 
thus involves the specification, design, implementa- 
tion, and evaluation of interactive software systems 
in the context of the user’s task and work (Preece, 
Rogers, & Sharp, 2002; Preece, Rogers, Sharp, 
Benyon, Holland, & Carey, 1994; Shneiderman, 1998). 

An aspect related to HCI is interaction design. 
Interaction design is the process of designing interac- 
tive products to support people in their everyday and 
work lives. In particular, it is about creating user 
experiences that enhance and extend the way people 



work, communicate, and interact (Preece et al., 
2002). 

As stated earlier, it is the users’ experience that 
affects their activities on the Web. The advocates 
of HCI are intent on discovering the key to success- 
ful user experiences and so the concept of usability 
is intensively investigated in HCI. The ISO 9241-11 
standard (1999) defines usability as the following: 
the extent to which a product can be used by a 
specified set of users, to achieve specified goals 
(tasks) with effectiveness, efficiency, and satisfac- 
tion in a specified context of use. 
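
The three components of this definition can be made concrete with a small calculation. The sketch below is only one plausible operationalization and is not prescribed by the standard or by this article: effectiveness is taken as the task completion rate, efficiency as completed tasks per minute, and satisfaction as the mean of a questionnaire rating on an assumed 1 to 5 scale. The names TaskAttempt and usability_summary are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    completed: bool      # did the user achieve the specified goal?
    seconds: float       # time spent on the attempt
    satisfaction: float  # questionnaire rating, assumed scale 1 to 5


def usability_summary(attempts):
    """Summarize effectiveness, efficiency, and satisfaction for one
    specified user group, goal, and context of use."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    completed = sum(1 for a in attempts if a.completed)
    total_minutes = sum(a.seconds for a in attempts) / 60
    if total_minutes <= 0:
        raise ValueError("total time must be positive")
    return {
        "effectiveness": completed / len(attempts),   # completion rate
        "efficiency": completed / total_minutes,      # completed tasks per minute
        "satisfaction": mean(a.satisfaction for a in attempts),
    }


attempts = [
    TaskAttempt(True, 60.0, 4.0),
    TaskAttempt(False, 120.0, 2.0),
    TaskAttempt(True, 60.0, 5.0),
    TaskAttempt(True, 120.0, 5.0),
]
summary = usability_summary(attempts)
print(summary)
```

Because the definition ties usability to specified users, goals, and context, a summary like this is meaningful only for the group and task it was measured on; the same product could score differently for another audience.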

INTEGRATED USABILITY 

Several researchers have produced sets of generic 
usability principles, which can be used in improving 
software (e.g., Mayhew, 1999; Preece et al., 1994; 
Shneiderman, 1998, 2000). Some of these usability 
principles are: learnability, visibility, consistency 
and standards, flexibility, robustness, responsive- 
ness, feedback, constraints, mappings, affordances, 
stability, simplicity, help, and documentation. Un- 
fortunately, the definitions of such design and us- 
ability principles are mostly too broad or general, 
and, in some cases, very vague. Some of these 
principles have been adapted for EC (see for ex- 
ample Badre, 2002). It has been shown repeatedly 
that general usability advice is not effective on its 
own when designing systems for a context-specific 
environment. Therefore, it is generally difficult for a 
non-usability expert or a novice to apply these 
principles in a particular domain and situation, taking 
into account the unique factors that give rise to 
problems in that domain. 

We argue that usability advice should be linked to 
a context-specific environment. For example, if a 
designer is interested in enticing surfers to stop 
browsing and engage in transactions, the designer 
would be well advised to make different design 
choices for an Internet banking site than for an 
online library. So, the design of a site for Pick ‘n Pay 
(supermarket chain), ABSA (commercial bank), 
and the University of South Africa’s library should 
therefore be approached differently. 

The HCI proponents also propose certain life 
cycle models. Williges, Williges, and Elkerton (1987), 
for example, have produced a model of development 
to rectify some of the problems in the “classic” life 
cycle model of SE. In this approach, HCI principles 
and interface design drive the whole process. Other 
such life cycle models include the Star model of 
Hartson and Hix (1989), the Usability Engineering 
life cycle of Mayhew (1999), and the Interaction 
Design model of Preece et al. (2002). These meth- 
ods also introduce various strategies for the develop- 
ment of effective user interfaces. The argument for 
putting forward these alternative development mod- 
els is that by spotting user requirements early on in 
the development cycle, there will be less of a de- 
mand for code generation and modification in the 
later stages of systems development. 

FUTURE TRENDS 

Standards can serve as good anchor points to focus 
the dialogues and collaborative activities. However, 
the existing standards are rather inconsistent and 
thus confusing. More efforts should be invested to 
render these tools more usable and useful. 
Specifically, it is worthwhile to develop implementation 
strategies for situating or localizing the standards so that 
they can be applied effectively in particular contexts 
(Law, 2003). 



CONCLUSION 

Both the SE proponents and the HCI proponents 
have a point with regard to their approach. SE 
proponents try to produce a workable solution and 
HCI proponents try to develop a usable solution. The 
two approaches are not mutually exclusive. A work- 
able solution may not be a usable solution, and a 
usable solution may not be a workable solution. The 
problem is that the HCI advocates are isolated from 
their SE colleagues, who in turn ignore the HCI 
advocates. The HCI advocates use a “blinder ap- 
proach” in their attempt to develop software by only 
focusing on the HCI aspects of the design of soft- 
ware, while the SE developers are concerned with a 
satisfactory solution. The aspects of Figure 1 will in 
effect influence the HCI advocates’ approaches as 
well as the SE advocates’ approaches for designing 
software for the Web. The uncertainty aspect has to 
be factored into the design process. 
KEY TERMS 

Information Systems: First known as business 
data processing (BDP) and later as management 
information systems (MIS). The operative word is 
“system” because it combines technology, people, 
processes, and organizational mechanisms for the 
purpose of improving organizational performance. 

Interaction Design: The process of designing 
interactive products to support people in their every- 
day and work lives. 

ISO 9241-11: This part of ISO 9241 introduces 
the concept of usability but does not make specific 
recommendations in terms of product attributes. 
Instead, it defines usability as the “extent to which a 
product can be used by specified users to achieve 
specified goals with effectiveness, efficiency and 
satisfaction in a specified context of use.” 

Requirements Gathering: The process of 
eliciting the overall requirements of a product from 
the customer. 

Software Engineering: Concerned with the 
development of software systems using sound engi- 
neering principles for both technical and non-techni- 
cal aspects. Over and above the use of specification, 
design and implementation techniques, human fac- 
tors and software management should also be ad- 
dressed. 

Usability: The ISO 9241-11 standard definition 
of usability identifies three different aspects: (1) a 
specified set of users, (2) specified goals (tasks) 
which have to be measurable in terms of 
effectiveness, efficiency, and satisfaction, and (3) the context 
in which the activity is carried out. 
INTRODUCTION 

With the advent of the electronic mail system in the 
1970s, a new opportunity for direct marketing using 
unsolicited electronic mail became apparent. In 1978, 
Gary Thuerk compiled a list of those on the Arpanet 
and then sent out a huge mailing publicising Digital 
Equipment Corporation (DEC, now Compaq) 
systems. The reaction from the Defense 
Communications Agency (DCA), which ran Arpanet, was very 
negative, and it was this negative reaction that 
ensured that it was a long time before unsolicited e- 
mail was used again (Templeton, 2003). As long as 
the U.S. government controlled a major part of the 
backbone, most forms of commercial activity were 
forbidden (Hayes, 2003). However, in 1993, the 
Internet Network Information Center was priva- 
tized, and with no central government controls, 
spam, as it is now called, came into wider use. 

The term spam was taken from Monty Python’s 
Flying Circus (a UK comedy group) and their 
comedy skit that featured the ironic spam song sung 
in praise of spam (luncheon meat) — “spam, spam, 
spam, lovely spam” — and it came to mean mail that 
was unsolicited. Conversely, the term ham came to 
mean e-mail that was wanted. Brad Templeton, a 
UseNet pioneer and chair of the Electronic Frontier 
Foundation, has traced the first usage of the term 
spam back to MUDs (Multi User Dungeons), or 
real-time multi-person shared environments, and the 
MUD community. These groups introduced the term 
spam to the early chat rooms (Internet Relay Chats). 

The first major UseNet (the world’s largest 
online conferencing system) spam was sent in January 
1994 and was a religious posting: “Global alert for all: 
Jesus is coming soon.” The term spam was more 
broadly popularised in April 1994, when two 
lawyers, Canter and Siegel from Arizona, posted a 
message that advertised their information and legal 



services for immigrants applying for the U.S. Green 
Card scheme. The message was posted to every 
newsgroup on UseNet, and after this incident, the 
term spam became synonymous with junk or unso- 
licited e-mail. Spam spread quickly among the 
UseNet groups who were easy targets for spammers 
simply because the e-mail addresses of members 
were widely available (Templeton, 2003). 

BACKGROUND 

At present, the practice of spamming is pervasive; 
however, due to the relatively recent nature of the 
problem and its fast-changing nature, discussion of 
the topic in the academic literature has been limited. 
While in computer science literature 
there has been a concentration of work on the 
technical features and solutions designed to prevent 
or ameliorate the practice (Androutsopoulos et al., 
2000; Gburzynski & Maitan, 2004; Goodman & 
Rounthwaite, 2004), the more general scientific 
discussion has been provided by a few scientific 
commentators (Gleick, 2003; Hayes, 2003), and the 
few books written on the subject (Schwartz & 
Garfinkel, 1998) have become outdated in a rela- 
tively short span of time. In other academic areas, 
there is some literature available concerning the 
legal implications of spam (Crichard, 2003) and the 
marketing dimension of spamming (Nettleton, 2003; 
Sipior et al., 2004); however, these, too, have suf- 
fered from the fast changing and global scope of the 
problem. Furthermore, aspects such as the social 
and political implications of spamming have been 
restricted to journalistic commentary in newspaper 
articles (BBC News, 2003, 2004; Gleick, 2003; 
Krim, 2004). In order to provide a broader focus in 
this article, therefore, the authors have supplemented 
this literature with interviews conducted with spe- 
cialists in the field in order to provide the most up-to- 
date information, including interviews with Enrique 
Salem, CEO of Brightmail; Mikko Hyponnen of F- 
Secure; and Steve Linford of the Spamhaus Project. 

However, while the broader issues of spamming 
have been discussed in the general literature re- 
viewed, in the area of human-computer interaction, 
there has been a paucity of discussion, although this 
may change with the wider take-up of mobile de- 
vices with their context awareness. Notable articles 
that have touched on related issues in the human- 
computer interaction field have included those that 
have considered issues of privacy (Ackerman et al.,
2001) and usability, in particular difficulties with
using computer technology (Kiesler et al., 2000).
However, this is not to say that spamming does not 
play a role in reversing the convenience that many 
experience when using e-mail on their desktop, 
laptop, or mobile device, and it is often the most 
vulnerable that are affected adversely by spamming 
practice. 

The mass appeal and use of electronic mail over 
the Internet has brought with it the practice of 
spamming or sending unsolicited bulk e-mail adver- 
tising services. This has become an established 
aspect of direct marketing, whereby marketers can 
reach many millions of people around the world with 
the touch of a button. However, this form of direct 
marketing or spamming, as it has come to be called, 
has become an increasing problem for many, wasting
people’s time as they delete unwanted e-mail and

Graph 1. The escalation of spam worldwide, 
2001 to July 2004 (Source: Brightmail) 




slowing down the movement of electronic traffic 
over local and wide area networks (Salem interview, 
2004; Goodman & Rounthwaite, 2004). 

The scale of the problem has become particularly
concerning in recent months; unsolicited e-mail, or
spam, accounted for 65% of all e-mail
received in July 2004 (Brightmail, 2004; Enrique
Salem, CEO of Brightmail, interview 2004). Of the
70 million e-mails that Brightmail filtered in September
2003 alone, 54% were unsolicited, and that percentage
is increasing year after year (see Graph 1).
But although there are a number of different ways to 
filter unwanted e-mail, which may lead to a signifi- 
cant reduction of spam in the short term, many 
experts in the field are concerned that spam will 
never be completely eradicated (Hyponnen, F-Se- 
cure interview, 2004; Linford, Spamhaus interview, 
2004). 

CRITICAL ISSUES OF SPAM 

So who are the spammers? The spammers can be
divided into three main groups: (1) legitimate commercial
direct marketers, who want to make commercial
gain from sending bulk e-mails about products
and services; (2) criminal groups, including
fraudsters, who are using spam to “legitimise” their
activities and to defraud others (Gleick, 2003; Levy,
2004; Linford interview, 2004); and (3) disaffected
individuals (crackers) who want to disrupt Internet
services and who, in many cases, may have
inside information about how the systems are structured.
The criminal group is potentially the most
dangerous, and while spam is not an illegal activity, 
this practice is set to spread to the criminal fraternity 
in China, Russia, and South America. This trend is 
becoming more widespread with the ease of obtain- 
ing spam kits over the Internet, which allows the 
potential spammer to set up quickly (Thomson, 2003). 

Increasingly, illegitimate spammers, fraudsters, 
and crackers are joining forces to introduce fraud 
schemes such as the 419 scam and phishing (sending 
e-mails as if they came from trusted organisations) 
to convince unsuspecting victims to reveal sensitive
personal information, in particular
credit card details or access
details for online transaction services (Levy, 2004).

WAYS OF COMBATING SPAM

In the light of this increasing problem, a series of
attempts, both technological and non-technological,
have been made to combat the annoyance of
full mailboxes, to counter the heavy load of
unwanted e-mail traffic, and to deter criminal activity
(Goodman & Rounthwaite, 2004). Hand in hand with
the push for tighter legislation to tackle the problem, 
several technical solutions have been deployed, and 
new ones are being proposed. 

Before an e-mail arrives in your mailbox it passes 
through a mail server, which is either hosted within 
your organization or through an Internet Service 
Provider (ISP). Filtering out spam at this early stage 
(pre-receipt) before the message arrives at your 
machine is obviously desirable, and many IT depart- 
ments and ISPs already have installed anti-spam 
software on their servers. Tools also exist that are 
user-based and filter out e-mail that already has 
arrived at your mailbox (post-receipt). Due to the 
flood of spam that is relentlessly sent to us, for now, 
it is probably best to have filtering tools both at the 
server and the user ends. 

Two problems that need to be addressed by any 
spam filtering system are the rates of false positives 
and false negatives. A false positive is a mail mes- 
sage that the filter tags as spam but is actually ham, 
while a false negative is a mail message that the filter 
tags as ham but is actually spam. Having no filter at 
all is the case of 0% false positives and 100% false 
negatives, and a filter that blocks everything is one 
with 100% false positives and 0% false negatives. 
Ideally, we want 0% false positives (i.e., all ham gets
through the filter) and 0% false negatives (i.e., all
spam is blocked).
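
As a rough illustration of the two rates defined above, the following Python sketch counts them from a list of (actual, predicted) labels. The function name and the sample mailbox are hypothetical, and the rates are taken relative to the ham and spam totals respectively, which is one common convention and an assumption here.

```python
def filter_error_rates(results):
    """Return (false_positive_rate, false_negative_rate).

    results: iterable of (actual, predicted) pairs, each "ham" or "spam".
    """
    results = list(results)
    ham_total = sum(1 for actual, _ in results if actual == "ham")
    spam_total = sum(1 for actual, _ in results if actual == "spam")
    # False positive: ham tagged as spam. False negative: spam tagged as ham.
    false_pos = sum(1 for a, p in results if a == "ham" and p == "spam")
    false_neg = sum(1 for a, p in results if a == "spam" and p == "ham")
    fp_rate = false_pos / ham_total if ham_total else 0.0
    fn_rate = false_neg / spam_total if spam_total else 0.0
    return fp_rate, fn_rate


mailbox = ["ham", "ham", "spam", "spam", "spam"]  # made-up sample
no_filter = [(m, "ham") for m in mailbox]    # everything is let through
block_all = [(m, "spam") for m in mailbox]   # everything is blocked
print(filter_error_rates(no_filter))   # (0.0, 1.0)
print(filter_error_rates(block_all))   # (1.0, 0.0)
```

The two calls reproduce the extreme cases described in the text: no filter at all, and a filter that blocks everything.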

The methods for combating spam include the 
following, which are summarized in tabular form (see 
Table 1). 

• Blocklisting 

• Protocol change 

• Economic solutions 

• Computational solutions 

• E-mail aliasing 

• Sender warranted e-mail 

• Collaborative filtering 

• Rule-based solutions 



• Statistical solutions 

• Legislative solutions 
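
To make one of the listed methods concrete, a rule-based filter can be sketched as a set of weighted patterns whose scores are summed and compared with a threshold. The patterns, weights, and threshold below are invented for illustration only; they are not taken from SpamAssassin or any other real rule set.

```python
import re

# Hypothetical patterns and weights, chosen only for illustration.
RULES = [
    (re.compile(r"\bfree\b", re.IGNORECASE), 1.5),
    (re.compile(r"\bact now\b", re.IGNORECASE), 2.0),
    (re.compile(r"!{3,}"), 1.0),
]
THRESHOLD = 3.0  # assumed cut-off; real filters need tuning


def rule_score(message):
    """Sum the weights of every rule whose pattern occurs in the message."""
    return sum(weight for pattern, weight in RULES if pattern.search(message))


def is_spam(message):
    """Tag the message as spam when its total score reaches the threshold."""
    return rule_score(message) >= THRESHOLD


print(is_spam("Act now for a FREE prize!!!"))    # True
print(is_spam("Minutes of the budget meeting"))  # False
```

Raising the threshold lowers false positives at the cost of more false negatives, which is one reason such filters need tuning.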






All these methods for combating spam impede 
the usability of e-mail and necessitate extra techni- 
cal and administrative support; however, the safety 
and security for individuals using Internet and e- 
mail-based services is reliant upon controlling the 
misuse of the systems; therefore, these methods 
are a trade-off between free and open access and 
secure and safe systems. Of course, there are 
social and political implications for employing these 
preventative methods; however, there is clearly a 
need to address these failings using more than one 
listed method. 

There is clearly a need to consider the problem 
of spam in the human-computer interaction field, 
particularly relating to issues of increasing usability 
for more vulnerable user groups, such as those with 
particular disabilities, frailties, and illnesses, who 
may be particularly susceptible to particular scams 
and fraudulent deceits. 



FUTURE TRENDS 

Future areas of development for spamming may 
center upon relatively unprotected mobile phones 
and devices (Sipior et al., 2004; Syntegra, 2003). To 
date, the practice known as wardriving, where 
individuals drive around until they detect wireless 
connectivity and then bombard the unprotected 
network with spam, provides a real indication about 
the potential dangers of spamming for the future. 
Another concerning trend has been the use of spam
to send out viruses (Stewart, 2003); the SoBig virus
attack, for example, used this method.

In addition, the cheap and easy availability of 
spam kits that provide mailing lists and the spamming 
software on the Internet have spread the practice to 
new territories, in particular to China, Russia, and 
South America, making the practice more wide- 
spread and leading to an escalation in the rate of 
spamming. 

Other recent adaptations of the spamming practice
have included the use of malicious code:
worms and trojans are used to create spam relays.
The MyDoom worm operated in this way, installing
proxies that spammers could then exploit.

Table 1. Methods for combating spam

| Solution | Method | Benefits | Limitations |
|---|---|---|---|
| Block listing | Use of lists of IP addresses of known sources of spam (e.g., SBL and RBL) | Blocks a significant volume of spam | Cannot block all spam and needs to be updated on a regular basis |
| Protocol change | To provide a method of tracking the source of an e-mail | Will help to identify spammers and add spam addresses to block lists | Will not prevent spam as such |
| Economic solutions | Impose a fee for sending e-mail | Will deter spammers from sending large volumes of junk e-mail | Will be difficult and costly to implement a worldwide standard for collecting the fee |
| Computational solutions | Impose an indirect payment in the form of a machine computation prior to sending e-mail | It is a viable alternative to the economic solution, without needing the infrastructure to collect a fee | A protocol involving cryptographic techniques will need to be put in place and software developed to implement the method |
| E-mail aliasing | Set up e-mail aliases for different groups of people with different acceptance criteria | Will reduce spam through an authentication process | This method involves an extension to current e-mail servers and the management of e-mail aliases |
| Sender warranted e-mail | Use of a special header to certify the e-mail as valid | No need for additional software or e-mail protocol | Will probably not deter spammers if widely adopted, and wide licensing of the technology will be problematic |
| Collaborative filtering | Communities collaborate to fight spam using a collaborative tool that is an add-on to e-mail software | Possible eradication of large volumes of spam through collaborative reporting of spam | Still vulnerable to random changes in spam e-mail, and there are problems with scalability of this method |
| Rule-based solutions | These filters maintain a collection of patterns to be matched against incoming spam, as in SpamAssassin | It is easy to install and effective in blocking a large percentage of spam, and in the case of SpamAssassin, it is free | It needs a lot of tuning and should be combined with other methods to filter out a larger volume of spam |
| Statistical solutions | Often deployed as a post-receipt spam filter using Bayesian text classification to tag e-mail as spam or ham | It is very effective and also adaptive, so it is hard to fool | Most effective when used with other pre-receipt filter systems |
| Legislative solutions | National and global legislation to enforce anti-spam laws | Prosecution of individual spammers | Problems of enforcement, not least due to crossing of different jurisdictional boundaries |
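
The statistical solution in Table 1 can be illustrated with a very small Bayesian text classifier. This is a minimal sketch under stated assumptions: whitespace tokenisation, add-one smoothing, and a four-message training set that is entirely made up. The class and method names are hypothetical, and a production filter would be considerably more elaborate.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny word-based Bayesian classifier with add-one smoothing.

    Assumes at least one training message for each of "spam" and "ham".
    """

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words, label):
        counts = self.word_counts[label]
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(counts.values())
        total_messages = sum(self.message_counts.values())
        # Start from the prior probability of the class.
        score = math.log(self.message_counts[label] / total_messages)
        for word in words:
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        return score

    def classify(self, text):
        words = text.lower().split()
        spam = self._log_score(words, "spam")
        ham = self._log_score(words, "ham")
        return "spam" if spam > ham else "ham"


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")
print(spam_filter.classify("buy cheap pills"))   # spam
print(spam_filter.classify("agenda for lunch"))  # ham
```

Because the word counts are updated with every call to train, a filter of this kind adapts to each user's mail, which is the property the table describes as making it hard to fool.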



The increase in the technical sophistication of 
spammers also is evidenced by the use of so-called 
reputation attacks, where spammers use a worm to 
launch a denial of service attack against anti- 
spamming organisations. One such example was the 
Mimail attacks (Levy, 2004) that specifically tar- 
geted anti-spam organisations seeking to block out 
spam. Clearly, spamming is becoming more refined 
and will evolve to adapt to any perceived weak- 
nesses in network security. 

CONCLUSION 

This article has highlighted the scale and depth of the 
spamming problem, and while many are committed 



to the eradication of all Internet-based fraud and 
illegitimate activity, it seems unlikely that spam will 
completely disappear. It is more likely that the 
practice will continue to evolve and transmute to 
adapt to new vulnerabilities in the systems and to 
exploit users who are not fully aware of how they 
can be exploited through impersonations of familiar 
Web sites and services. With the current force 
behind the anti-spam movement gaining momentum, 
we can expect to see less spam in the future, but only 
with preventative measures such as those described 
in this article being put in place. In the near future 
however, the cat-and-mouse game among spammers 
and anti-spammers is set to continue. In particular,
the new routes for spammers clearly lie in reaching
users through mobile devices, which need to become
better protected by anti-virus and anti-spam software if these
cyber crimes are to be controlled and ameliorated.

E-Mail Aliasing: Where an individual has more
than one e-mail address, the practice allows the user 
to use different addresses for different tasks; for 
example, one address for Internet communications 
and another for business. 

False Negatives: A false negative is a mail 
message that the filter tags as ham but is actually 
spam. 

False Positives: A false positive is a mail 
message that the filter tags as spam but is actually 
ham. 

Phishing: Short for password harvest fishing, it 
is the process of impersonating another trusted 
person or organization in order to obtain sensitive 
personal information, such as credit card details, 
passwords, or access information. 



Sender Warranted E-Mail: This method al- 
lows the sender to use a special header to certify that 
the e-mail is genuine. The process could help to 
prevent spam scams. 

Spam: Otherwise termed unsolicited e-mail, un- 
solicited commercial e-mail, junk mail, or unwanted 
mail, it has been used in opposition to the term ham, 
which is wanted e-mail. The term was developed 
from a Monty Python comedy sketch depicting spam 
as useless and ham as lovely, albeit in ironic terms. 

Wardriving: Also termed WiLDing (Wireless
LAN Driving), it is an activity whereby individuals
drive around an area detecting Wi-Fi wireless net- 
works, which they then can access with a laptop. 

INTRODUCTION

Spam, undesired and usually unsolicited e-mail, has 
been a growing problem for some time. A 2003 
Sunbelt Software poll found spam (or junk mail) has 
surpassed viruses as the number-one unwanted 
network intrusion (Townsend & Taphouse, 2003). 
Time magazine reports that for major e-mail provid- 
ers, 40 to 70% of all incoming mail is deleted at the 
server (Taylor, 2003), and AOL reports that 80% of 
its inbound e-mail, 1.5 to 1.9 billion messages a day, 
is spam the company blocks. Spam is the e-mail
consumer’s number-one complaint (Davidson, 2003).
Despite Internet service provider (ISP) filtering, up 
to 30% of in-box messages are spam. While each of 
us may only take seconds (or minutes) to deal with 
such mail, over billions of cases the losses are 
significant. A Ferris Research report estimates 2003
spam costs for U.S. companies at $10 billion (Bekker,
2003).

While improved filters send more spam to trash 
cans, ever more spam is sent, consuming an increas- 
ing proportion of network resources. Users shielded 
behind spam filters may notice little change, but the 
Internet transmitted-spam percentage has been 
steadily growing. It was 8% in 2001, grew from 20% 
to 40% in 6 months over 2002 to 2003, and continues 
to grow (Weiss, 2003). In May 2003, the amount of 
spam e-mail exceeded nonspam for the first time, 
that is, over 50% of transmitted e-mail is now spam 
(Vaughan-Nichols, 2003). Informal estimates for
2004 are over 60%, with some as high as 80%. In
practical terms, an ISP needing one server for 
customers must buy another just for spam almost no 
one reads. This cost passes on to users in increased 
connection fees. 

Pretransmission filtering could reduce this waste, 
but creates another problem: spam false positives, 
that is, valid e-mail filtered as spam. If you accidentally
use spam words, like enlarge, your e-mail may
be filtered. Currently, receivers can recover false 
rejects from their spam filter’s quarantine area, but 
filtering before transmission means the message 
never arrives at all, so neither sender nor receiver 
knows there is an error. Imagine if the postal mail 
system shredded unwanted mail and lost mail in the 
process. People could lose confidence that the mail 
will get through. If a communication environment 
cannot be trusted, confidence in it can collapse. 

Electronic communication systems sit on the 
horns of a dilemma. Reducing spam increases deliv- 
ery failure rate, while guaranteeing delivery in- 
creases spam rates. Either way, by social failure of 
confidence or technical failure of capability, spam 
threatens the transmission system itself (Weinstein,
2003). As the percentage of transmitted spam increases,
both problems increase. If spam were 99%
of sent mail, a small false-positive percentage would
become a much higher percentage of valid e-mail that
failed. The growing spam problem is recognized
ambivalently by IT writers who espouse new Baye- 
sian spam filters but note, “The problem with spam 
is that it is almost impossible to define” (Vaughan- 
Nichols, 2003, p. 142), or who advocate legal solu- 
tions but say none have worked so far. The technical 
community seems to be in a state of denial regarding 
spam. Despite some successes, transmitted spam is 
increasing. Moral outrage, spam blockers, spamming 
the spammers, black and white lists, and legal re- 
sponses have slowed but not stopped it. Spam 
blockers, by hiding the problem from users, may be 
making it worse, as a Band-Aid covers but does not 
cure a systemic sore. Asking for a technical tool to 
stop spam may be asking the wrong question. If 
spam is a social problem, it may require a social 
solution, which in cyberspace means technical support
for social requirements (Whitworth & Whitworth,
2004).
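
The point made above about spam at 99% of sent mail can be checked with a short calculation. The sketch assumes that the false-positive percentage is measured as a share of all transmitted mail, which is one reading of the text; the 0.5% figure and the function name are hypothetical.

```python
def valid_mail_lost(spam_share, false_positive_share):
    """Fraction of valid e-mail lost, when false positives are measured
    as a share of ALL transmitted mail (an assumption of this sketch)."""
    valid_share = 1.0 - spam_share
    return false_positive_share / valid_share


# The same 0.5% false-positive share at three levels of spam.
for spam_share in (0.50, 0.90, 0.99):
    lost = valid_mail_lost(spam_share, false_positive_share=0.005)
    print(f"spam {spam_share:.0%}: {lost:.1%} of valid mail lost")
```

Under these assumptions the same false-positive share costs about 1% of valid mail when half of traffic is spam, but about half of all valid mail when spam reaches 99%.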

BACKGROUND
Why Spam Works 

Spam arises from the online social situation technology
creates. First, it costs no more to send a million
e-mails than to send one. Second, “hits” are a
percentage of transmissions, so the more spam sent,
the more the sender profits. Hence, it pays individuals
to spam. The logical goal of spam generators is to
reach all users to maximize hits at no extra cost. Yet 
the system cannot sustain this. With 23 million 
businesses in America alone, if each sent just one 
unsolicited message a year to all users, that is over 
63,000 e-mails per person per day. Spam seems the 
electronic equivalent of the “tragedy of the commons”
(Hardin, 1968), where some farmers, each
with some cows and land, live near a common grass 
area. The tragedy is that if the farmers calculate 
their benefits, they all graze the commons, which is 
destroyed from overuse. In this situation, individual 
temptation can undermine a public-good commons. 
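
The per-person figure quoted above follows from simple division, shown here as a short check. The constant names are introduced only for this illustration; the 23 million figure is the one given in the text.

```python
BUSINESSES = 23_000_000      # figure quoted in the text
MESSAGES_PER_BUSINESS = 1    # one unsolicited message per user per year
DAYS_PER_YEAR = 365

# Each user receives one message from every business, spread over a year.
per_person_per_day = BUSINESSES * MESSAGES_PER_BUSINESS / DAYS_PER_YEAR
print(round(per_person_per_day))  # 63014
```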

For spam, the public good is free online commu- 
nication, and the commons is the wires, storage, and 
processors of the Internet. The individual temptation 
is to use the commons for personal gain. E-mail 
creates value by exchanging meaning between 
people. As spam increases, e-mail gives less mean- 
ing for more effort, that is, less value. Losses include 
wasted processing, storage, and lines; “ignore time” 
(time to reject spam); antispam software costs; time 
to resolve spam false positives; time to confirm spam
challenges; important messages lost by spam; and 
unknown lost opportunity costs from messages not 
sent because spam raises the user cost to send a 
message (Reid, Malinek, Stott, & T., 1996). E-mail 
lowered this communication threshold, but spam 
makes communication harder by degrading the e-mail
commons. If half of Internet traffic is spam, the
Internet is half wasted, and for practical purposes, 
half destroyed. Spam seems to be an electronic 
tragedy of the commons. 

SOME SPAM RESPONSES 

If spam is a traditional social problem in electronic 
clothes, why not use traditional social responses? 



Ignore It 

One answer to spam is to ignore it: After all, if no one 
bought, spam would stop. However, a “handful of
positive responses is enough to make a mailing pay 
off, and there will always be a handful of suckers out 
there” (Ivey, 1998, p. 15). There are always spam 
responders; a new one is born on the Internet every 
minute. 

Ethics 

Online society seems unlikely to make people more 
ethical than they are in physical society, so it seems 
unlikely spammers will “see the light” any time soon. 

Barriers 

Currently the most popular response to spam is spam 
filters, but spammers need only 100 takers per 10 
million requests to earn a profit (Weiss, 2003), much 
less than a 0.01% hit rate. So even with 99.99% 
successful spam blockers, spam transmission will 
increase. 
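
The arithmetic behind this argument can be sketched as follows. The first figure comes from the numbers quoted above; the second rests on the assumption, made here only for illustration, that the response rate of delivered messages stays the same when a filter is in place.

```python
takers_needed = 100
messages_sent = 10_000_000           # figures quoted from Weiss (2003)
break_even_rate = takers_needed / messages_sent
print(f"{break_even_rate:.3%}")      # 0.001%

# Assumption: a filter that stops 99.99% of spam lets 0.01% through,
# and the response rate of delivered messages is unchanged.
pass_through = 1 - 0.9999
sends_for_same_reach = messages_sent / pass_through
print(f"{sends_for_same_reach:.2e}")
```

Since sending costs the spammer almost nothing, the rational response to a 99.99% effective filter is to send roughly ten thousand times as much, on the order of 10^11 messages in this example, which is why better filters can coincide with more transmitted spam.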

Revenge 

One way users handled companies faxing annoying 
unsolicited messages was by “bombing” them with 
return faxes, shutting down their fax machines. For 
e-mail, ISPs, not senders, are registered. If we 
isolate ISPs that allow spam, this penalizes valid 
users as well as spammers. Lessig (1999) argued 
before the U.S. Supreme Court for a bounty on 
spammers, “like bounty hunters in the Old West” 
(Bazeley, 2003). However, the cyberspace “Wild 
West” is not inside America, nor under U.S. courts. 
And do we really want an online vigilante society? 

Third-Party Guarantees 

Another approach is for a trusted third party to 
validate all e-mail. The Tripoli method requires all e- 
mails to contain an encrypted guarantee from a third 
party that it is not spam (Weinstein, 2003). However, 
custodian methods require significant coordination 
and raise Juvenal’s question, “Quis custodiet ipsos
custodes [Who will watch our watchers]?” Will






Spam as a Symptom of Electronic Communication Technologies that Ignore Social Requirements 



stakeholders like the Direct Marketing Association 
or Microsoft guarantee against spam? If spam is in 
the eye of the beholder, such companies may con- 
sider their spam not spam at all. 

Legal Responses 

Why not just pass a law against spam? This approach 
may not work for several reasons (Whitworth & 
deMoor, 2003). First, virtual worlds work differently 
from the physical world. Applying laws online cre- 
ates problems; for example, financial and health-care 
organizations by law must archive all communica- 
tions so must not only receive spam, but also store it 
(Paulson, 2003). It is difficult to stretch physical law 
into cyberspace (Samuelson, 2003). Legal prosecu- 
tions require physical evidence, an accused, and a 
plaintiff, yet spam evidence is in a malleable 
cyberspace, e-mail sources are easily “spoofed” to 
hide the accused, and the plaintiff is everyone with an 
e-mail address. What penalties apply when each 
individual loses so little? Second, virtual worlds change 
faster than physical worlds. Spam can mutate in 
form, for example, Internet messaging spam or 
“spim.” Any spam variant would require new laws, 
yet while society takes years to pass laws, the 
Internet can change monthly. Third, in cyberspace, 
code is law (Mitchell, 1995). Software can make 
spammers anonymous or generate new addresses so 
quickly that bans have no effect. Finally, laws are 
limited by jurisdiction; for example, state laws against 
telemarketers were ineffective against out-of-state 
calls, and the U.S. nationwide do-not-call list is 
ineffective against overseas calls. U.S. law applies 
to U.S. soil, but spam can come from any country. 
Traditional law seems too physically constrained, too 
slow, and too impotent to deal with the spam chal- 
lenge. As Ken Korman (2003, p. 3) concedes, “Though 
legislative efforts to control spam continue, it is 
unlikely that new laws will have any real effect on the 
problem.” PC World adds, “By all accounts, CAN- 
SPAM has failed to stop the e-mail inundation” 
(Spring, 2004, p. 24). 

Challenge Systems 

Challenge systems, like MailBlocks (2003), ask e- 
mail senders, “Are you really a person? If so, type the 



number shown in this graphic.” Since most spam is 
computer generated, and most spammers will not 
accept replies (lest they be spammed in return), 
such methods work well, but users communicate 
twice to receive once. 
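
A challenge system of this kind can be sketched in a few lines. The class below is a hypothetical illustration, not a description of how MailBlocks works: it holds mail from unknown senders, issues a numeric challenge, and releases the held mail once the correct number is returned.

```python
import secrets


class ChallengeGate:
    """Holds mail from unknown senders until they answer a challenge."""

    def __init__(self):
        self.known_senders = set()
        self.pending = {}   # sender -> (expected answer, held messages)
        self.inbox = []

    def receive(self, sender, message):
        """Deliver mail from known senders; otherwise hold it and
        return the challenge text that would be sent back."""
        if sender in self.known_senders:
            self.inbox.append((sender, message))
            return None
        if sender not in self.pending:
            answer = f"{secrets.randbelow(10**6):06d}"
            self.pending[sender] = (answer, [])
        answer, held = self.pending[sender]
        held.append(message)
        # A real system would render the number as an image, not text.
        return f"Type this number to confirm you are a person: {answer}"

    def respond(self, sender, reply):
        """Release held mail if the reply matches the challenge."""
        if sender not in self.pending:
            return False
        answer, held = self.pending[sender]
        if reply.strip() != answer:
            return False
        del self.pending[sender]
        self.known_senders.add(sender)
        self.inbox.extend((sender, m) for m in held)
        return True


gate = ChallengeGate()
challenge = gate.receive("alice@example.org", "Hello")
code = challenge.rsplit(" ", 1)[-1]      # the sender reads the number
gate.respond("alice@example.org", code)  # second communication
print(gate.inbox)
```

The usage lines show the cost noted in the text: the sender must communicate twice before the first message is delivered.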






An E-Mail Charge 

One way to change the communication environment
is to charge for e-mail. This would hit
spammers’ pockets, but also reduce general usage
by increasing the communication threshold (Kraut, 
Shyam, Morris, Telang, Filer, & Cronin, 2002). 
What would be the purpose of a charge, however 
small? An Internet toll would add no new service as 
e-mail already works without such charges. Its sole 
purpose would be to punish spammers by slowing 
the flow for everyone. A variant is that all senders 
compute a time-costly function (Dwork & Naor, 
1993), but the effect is still to increase the transmis- 
sion cost. Increasing across-the-board e-mail costs 
seems like burning down your house to prevent 
break-ins. If e-mail were metered, we would all pay 
for something already paid for. Who would receive 
each payment? If senders paid receivers, each e- 
mail would be a money transfer. The cost of admin- 
istering such a system could outweigh its benefit, 
and who would set the charge rate? If e-mail 
providers took the charge, it would be an e-mail tax, 
but what global entity can legitimately claim it? 
Making the Internet a field of profit could open it to 
corruption. Spam works because e-mail costs so 
little, but that is also why the Internet works. Fast, 
easy, and free communication has benefited us all. 
To raise the communication threshold by charging 
for what we already have seems retrogressive. A 
solution that reduces spam but leaves the Internet 
advantage intact is to design for fair communication 
in the first place. 
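
The computational variant mentioned above, in which every sender computes a time-costly function, can be illustrated with a hash-based proof of work. This is a hashcash-style sketch chosen for brevity, not the specific function proposed by Dwork and Naor (1993); the stamp format, function names, and difficulty values are assumptions.

```python
import hashlib
from itertools import count


def mint_stamp(recipient, difficulty_bits=16):
    """Search for a nonce whose SHA-256 hash has the required number of
    leading zero bits. Cost grows roughly as 2**difficulty_bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        stamp = f"{recipient}:{nonce}"
        digest = hashlib.sha256(stamp.encode("utf-8")).digest()
        if int.from_bytes(digest, "big") < target:
            return stamp


def verify_stamp(stamp, recipient, difficulty_bits=16):
    """Checking a stamp takes one hash, so receivers pay almost nothing."""
    if not stamp.startswith(recipient + ":"):
        return False
    digest = hashlib.sha256(stamp.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


stamp = mint_stamp("bob@example.org", difficulty_bits=12)
print(verify_stamp(stamp, "bob@example.org", difficulty_bits=12))  # True
```

The asymmetry is the point of the design: one message costs the sender a fraction of a second, while a million messages cost a million times as much. As the text argues, however, the cost still falls on every sender, not only on spammers.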



LEGITIMATE COMMUNICATION 

Spam is an opportunity as well as a threat. The 
challenge is to close the social-technical gap 
(Ackerman, 2000) between society and technol- 
ogy. Traditional social methods, like the law, are 
struggling to do this. An alternative is for technology
to support society rather than being impartial to
social needs. The Internet was once thought to be 
innately ungovernable, but it could just as easily 
become a system of perfect regulation and control 
(Lessig, 1999). If in cyberspace code, not law, 
makes the rules, it makes sense to design social 
software to support legitimate interaction, that is, 
social exchanges that are both fair to individuals and 
beneficial to the social group (Whitworth & 
Whitworth, 2004). This raises the question of whether 
spam is legitimate communication. 

Is Spam Legitimate Communication? 

Spam is unfair because senders have all the trans- 
mission choices, just like telemarketers who have 
your home phone number but invariably refuse to 
give you theirs. They call you at home, but you 
cannot call them at home. Spammers waste others’ 
time, but this is irrelevant to them because it is not 
their loss. Yet the loss is still real, and it is unfair that
those who cause it do not bear it, that those who 
suffer spam are not its creators. 

Spam is unprofitable to society if its total losses 
exceed its total profit. If 90% of people spammed do 
not buy, do their losses balance the gains of the 10% 
who buy? What if 99.9% do not buy? There is a 
saturation point when spam’s losses outweigh its 
benefits. We seem well past that point already. By 
one estimate, it costs about $250 to send a million e- 
mails, which cause about $2,800 in lost wages to 
society in general (Emery, 2003). Spammers steal 
time, which in today’s world equates to money. 
Some see it as a mild crime, like littering on the 
Internet, but when litter blocks the streets, there is 
concern. Over millions of people the productivity 
loss is significant, as a cyber thief taking a few cents 
from millions of bank accounts can steal a sizable 
sum. 

If spam is unfair to individuals and harmful to 
online society, it is illegitimate communication on 
two counts. 

Communication Rights 

The method of legitimacy analysis (Whitworth & 
deMoor, 2003) asks, Who owns the elements of e- 
mail communication: the messages, channels, and 
addresses? 



Who Owns E-Mail Messages? 

From a social-rights perspective, e-mail is a request, 
not a requirement, to receive a message. Receivers 
should be able to refuse ownership after reading it, 
perhaps via an e-mail toolbar rejection button. The 
receiver does not own a rejected message (by 
definition), and the transmission system does not 
own it, so it belongs to the sender who created it and, 
as with postal mail, should be returned to the sender. 
This does not happen because e-mail was designed 
as a forward-and-forget system, so replies to 
spammers may go nowhere (Cranor & LaMacchia, 
1998), one reason the spam-the-spammer approach 
does not work (Held, 1998). 

The social logic that communication is a two-way 
process implies that receiving back rejected e-mail 
should be a necessary condition of transmission. 
Rejected spam would then return down the sender’s
communication lines to their computer, creating 
spammer disk and channel costs. It seems ineffi- 
cient to return rejected messages that can be deleted 
at delivery, but supporting social accountability in the 
long term both reduces waste and tells senders an e- 
mail was rejected. Currently, spammers do not know 
who reads their messages and who does not. If 
rejected e-mail were returned, it would pay spammers
to reduce their lists and give them the information 
needed to do so. The right to reject e-mail is a social 
requirement. Implementing it is an engineering prob- 
lem. The e-mail transmission system controls both 
the pieces of the communication game and the board 
itself. It should be able to enforce a rule that to send 
into the system, one must also receive from it. 

Who Owns Communication Channels? 

Current systems give any sender the automatic right 
to open a channel to another. Yet society gives no 
such right to communicate, but rather the right to be 
left alone (Warren & Brandeis, 1890). The social 
concept is that one is not forced to communicate. To 
pursue undesired interaction is to harass or stalk. If 
someone knocks on our door, we need not answer. 
If they telephone, we need not pick up. But we get 
e-mail in our inbox, like it or not. 

E-mail systems could present new messages in 
two parts: an initial “Can I talk to you?” channel 






request, then the messages and content. Channel 
requests could give channel properties like the sender, 
title, and reciprocity (if replies are accepted; Rice, 
1994), but not message content. Microsoft’s plan to 
offer caller ID for e-mail seems a step in the right 
direction as it gives some channel information to 
receivers, but why not give all channel information? 
Receivers could then only receive messages from 
those who also receive. Current challenge-spam 
defenses offer this service but transmit content 
multiple times, and if the challenge bounces, they 
multiply spam. 

Channel requests would send no content, only 
channel properties. The receiver can choose to open 
the channel or not. No third party need guarantee 
anything. No tedious challenges to sender humanity 
are needed. Sending messages is as before, except 
one could get a “channel unavailable” response. 
This is not a message rejection, but an unwillingness 
to talk at all. To receivers, messaging would also 
look the same, except unknown messages (like 
spam) would appear in a separate “Request to 
Converse” in-box, where users must double-click 
them to get content. Since most people do not click 
on spam, transmission volumes would reduce. Such 
handshaking occurs in data networks and could 
occur for e-mail. Giving known senders a permanent 
channel would create a self-generated list of known 
communicants (Hall, 1998). 
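
The channel-request idea described above can be sketched as a small handshake. All names below are hypothetical, and the policy of refusing senders who do not accept replies is one possible reading of the text, not a specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChannelRequest:
    """Channel properties only; no message content is carried."""
    sender: str
    title: str
    accepts_replies: bool   # the "reciprocity" property


@dataclass
class Mailbox:
    open_channels: set = field(default_factory=set)
    requests: list = field(default_factory=list)   # "Request to Converse"
    inbox: list = field(default_factory=list)

    def request_channel(self, req):
        """Return True if the sender may transmit content now."""
        if req.sender in self.open_channels:
            return True
        if not req.accepts_replies:
            return False          # "channel unavailable"
        self.requests.append(req)
        return False              # pending until the receiver opens it

    def open_channel(self, sender):
        """Receiver chooses to converse; the channel stays open."""
        self.requests = [r for r in self.requests if r.sender != sender]
        self.open_channels.add(sender)

    def deliver(self, sender, content):
        """Accept content only over a channel the receiver has opened."""
        if sender not in self.open_channels:
            raise PermissionError("channel unavailable")
        self.inbox.append((sender, content))


box = Mailbox()
request = ChannelRequest("carol@example.org", "Lunch?", accepts_replies=True)
box.request_channel(request)           # queued, no content sent yet
box.open_channel("carol@example.org")  # receiver agrees to converse
box.deliver("carol@example.org", "Are you free at noon?")
print(box.inbox)
```

In this sketch the set of open channels plays the role of the self-generated list of known communicants, and message content crosses the network only after the receiver has opened the channel.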

Who Owns E-Mail Addresses? 

The social concept of privacy suggests that people 
own their personal data. Good companies already 
include in their messages phrases such as, “To stop 
further e-mail, reply to this message.” Yet these 
voluntary acts are not enough. Spammers can feign 
them, or worse, use your reply to confirm an active 
e-mail and sell your address to others. Requesting 
removal could put you on even more lists, becoming 
what PC World magazine calls “spam bait” (Spring, 
2003): “By now, most computer users know that 
replying to most spam only generates more spam” 
(Woellert, 2003, p. 56). Yet if users managed their 
own online data records, they could save companies 
data-maintenance costs. 



FUTURE TRENDS 

Currently, spam is tolerated by technology as the 
bandwidth can handle it. However, this may not 
continue. Some hope technology will continue to 
expand bandwidth and processing beyond the spam 
challenge, but simple arithmetic suggests otherwise. 
The spam potential increases as the square of the 
number of users, which grows each day. In a future 
with billions of people online, the potential interac- 
tions are beyond any technology we can presently 
conceive. The predictions are gloomy. Given current 
trends, it seems there is nothing to stop spam from 
becoming over 95% of Internet transmissions in a 
decade. Meanwhile, society’s laws still struggle 
with telephone spam (telemarketing), let alone com- 
puter spam. The question seems to be not if e-mail 
will fail, but when. 
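The "simple arithmetic" can be made concrete. Assuming every user can address every other user, n users allow n(n - 1) directed sender-receiver pairs, which grows roughly as the square of n. The function name and the user counts below are illustrative assumptions, not figures from the article.

```python
def potential_channels(users: int) -> int:
    """Directed sender-to-receiver pairs among `users` people (assumed model)."""
    return users * (users - 1)


for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} users -> {potential_channels(n):,} potential channels")
```

Under this assumption, doubling the number of users roughly quadruples the number of potential channels.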

Some experts suggest e-mail is already “broken,” 
but will be replaced by new, and better, forms of 
communication (Boutin, 2004). Time will tell if this is 
true. If spam is a general social disease, it may cross 
application boundaries. Already, spim, a spam ver- 
sion of Internet instant messaging (Hamilton, 2004), 
is growing faster than spam ever did. Technology 
may not insulate us from antisocial acts in computer- 
mediated communication (CMC). 

Spam seems to be a watershed moment, a critical 
point at which traditional social values and technol- 
ogy power confront. The stakes are high. If human 
society loses its way in cyberspace, the vision of an 
electronic global society may fade. A brighter sce- 
nario is that the legitimate-communication require- 
ment will be recognized and technology redesigned 
accordingly; that is, the social-technical gap will 
close. Currently, the unity of global society is not 
political or legal, but technical. Society lets people 
return postal mail, but e-mail does not let people 
return messages. Society recognizes the right not to 
communicate, but e-mail gives a right to communi- 
cate. Society would let people remove themselves 
from marketing lists, but one cannot remove oneself 
from e-mail lists. Technology has the social require- 
ments backward. Spammers force messages upon 
us that we should be able to reject. They access in- 
boxes we should own. They control e-mail ad- 
dresses that should be ours. Technology gives 
spammers every reason to do what they are doing, 
and no reason to stop. 

If the social-technical gap were reduced, spam 
would also reduce. If e-mail could be returned to the 
sender and really arrive there, spam would reduce. 
If spammers had to “knock” before entering an in- 
box, spam would reduce. If e-mail users could 
remove themselves from e-mail lists, spam would 
reduce. 

Such legitimacy-based changes have a unique 
property: They do not selectively discriminate spam 
or spammers. They would apply to all of us equally. 
Everyone’s personal data would be their personal 
property. Anyone could converse or not. Any e-mail 
could be rejected, not just spam. The goal is legiti- 
mate interaction, not punishment or revenge, to 
reduce unwanted mail from all of us, not just 
spammers. 

CONCLUSION 

These conclusions can be summarized as follows. 

1. Technology advances alone, like filters, will not 
in the long run reduce spam. 

2. Traditional social solutions alone, like the law, 
will work poorly in cyberspace. 

3. Spam is a social problem that requires a social 
solution. 

4. The technical architecture of social-technical 
systems must support social requirements for 
social solutions to work. 

The growing flood of spam from spam-generat- 
ing to spam-filtering machines — information without 
meaning sent from no one to no one — seems a good 
place to start facing the social-technical challenge. 
KEY TERMS 

Asynchronous Communication: E-mail is nor- 
mally considered asynchronous communication. 
Synchrony has been defined as “the extent to which 
individuals work together on the same activity at the 
same time” (Dennis & Valacich, 1999), but is e-mail 
synchronous if e-mail communicants are online at 
the same time? Another view is that synchrony 
requires instant transmission, but if e-mail became 
instantaneous, would it then be synchronous? Con- 
versely, consider a telephone (synchronous) conver- 
sation during which one party boards a rocket to 






Mars; as the rocket leaves, there is a transmission 
delay of several minutes. Is the telephone now 
asynchronous communication? That the same me- 
dium is both synchronous and asynchronous is unde- 
sirable. Media properties should only change when 
the medium changes; that is, they should be defined 
in media terms, not sender-receiver or transmission 
terms. The asynchronous-synchronous difference is 
whether the medium stores the message or not. In 
this, e-mail remains asynchronous no matter how 
fast it is, and telephone synchronous no matter how 
slow it is. The asynchrony is between receiver and 
medium, not receiver and sender. The opposite is 
ephemerality, in which signals must be processed on 
arrival. 

Communication Environment: In one sense, 
technology operates in a physical environment, but 
for computer-mediated communication, technology 
is the environment, that is, that through which com- 
munication occurs. Telephone, CMC, and face to 
face (FTF) are all equally communication environ- 
ments. FTF is mediated by the physical world just as 
CMC is mediated by technology. One cannot com- 
pare environments as one does objects in an environ- 
ment. To judge one environment by another is like 
saying the problem with America is that it is not 
England. Describing e-mail as distributed rather 
than colocated is like this. If distributed e-mail 
correspondents magically colocate in the same room, 
what changes? In their environment, nothing changes 
at all. E-mail is not distributed or colocated because 
physical space does not exist in cyberspace. Nor do 
environments perform as objects do. Imagine a new 
environment called “underwater.” Users find walk- 
ing underwater painfully slow, then find a new way 
of moving (swimming) that fits the environment 
better, inventing flippers to support it. Now the new 
world seems better. Asking which environment is 
better at walking is inappropriate. Cross-media stud- 
ies (CMC vs. FTF) make this mistake of analysing 
electronic communication in face-to-face terms (Hiltz 
& Turoff, 1985). A better approach is within-envi- 
ronment research designs (Whitworth, Gallupe, & 
McQueen, 2001). 

Communication Threshold: The acceptable 
user cost to send a message (Reid et al., 1996). If the 
cost to send a message is less than the individual’s 
messaging threshold, it is sent. Otherwise, it is not. 



E-mail lowered the messaging threshold so more 
messages were sent than otherwise would be. 
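The threshold rule can be written as a one-line comparison. The sketch below is only an illustration: the function names, the threshold values, and the cost units are hypothetical and do not come from Reid et al. (1996).

```python
def is_sent(cost: float, threshold: float) -> bool:
    """A message is sent only if its cost is below the sender's threshold."""
    return cost < threshold


def messages_sent(thresholds, cost):
    """Count how many candidate messages clear the given sending cost."""
    return sum(1 for t in thresholds if is_sent(cost, t))


# Hypothetical thresholds for four candidate messages, in arbitrary cost units.
thresholds = [0.05, 0.5, 2.0, 10.0]
sent_by_post = messages_sent(thresholds, cost=1.0)    # higher cost per message
sent_by_email = messages_sent(thresholds, cost=0.01)  # near-zero cost per message
```

With these made-up numbers, lowering the cost per message lets every candidate message clear its threshold, which is the effect the definition attributes to e-mail.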

Computer-Mediated Communication: CMC, 
like e-mail, is one-to-one, asynchronous communi- 
cation mediated by electronic means. List e-mail 
seems to be many-to-many communication, but the 
transmission system simply duplicates one-to-one 
transmissions. In true one-to-many transmissions, 
like a bulletin board, one communication operation is 
transmitted to many people (e.g., posting a mes- 
sage). 

Computer-Mediated Interaction: Computer- 
mediated interaction (CMI) is interaction mediated 
by electronic means, whether between people or 
computer agents. 

Cyberspace: Space is central to our lives, 
whether virtual or physical (Dodge & Kitchin, 2001). 
Gibson (1984) coined the term cyberspace from the 
Greek kyber (to navigate), describing a nonphysical 
space (the “matrix”) that substituted for reality. 
Today, it means the electronic environment that 
enables computer-mediated interaction. Cyberspace 
removes the physical space constraints of human 
interaction (Hauben, 1995) but is still a space, albeit 
of a different kind. Physical space locates us to a 
three-number coordinate position. Cyberspace also 
locates us to a unique URL (uniform resource 
locator) position. While physical locations have dif- 
fering distances between them, points in cyberspace 
seem equally distant. If one moves through 
cyberspace by mouse clicks, cyberspace points could 
have distances between them. In theory, every 
cyberspace point is one click from every other, but 
in practice, this is not so. Research on the diameter 
of the World Wide Web suggests an average of 19 
links between random points (Albert, Jeong, & 
Barabási, 1999). 
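A click distance of this kind is an average shortest-path length over a link graph. The sketch below computes it with breadth-first search on a tiny, invented four-page "web"; the graph and the function names are assumptions for illustration and are unrelated to the data behind the figure of 19.

```python
from collections import deque


def click_distances(links, start):
    """Fewest clicks from `start` to every reachable page (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def average_clicks(links):
    """Mean click distance over all ordered pairs where a path exists."""
    total = pairs = 0
    for page in links:
        for other, d in click_distances(links, page).items():
            if other != page:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical link structure: page -> pages it links to.
web = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
mean_clicks = average_clicks(web)
```

Because links are directed, page "d" can be reached from the others but leads nowhere, so unreachable pairs are simply left out of the average in this sketch.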

False Positive: A filtering system can make 
two types of errors: false acceptance and false 
rejection. The latter is a false positive. A spam filter 
can wrongly let spam through, or wrongly filter real e- 
mail as spam. In false acceptance, it is not doing its 
job, while in false positives, it is doing it too well. 
Decreasing one type of error tends to increase the 
other, as with Type I and Type II errors in experimen- 
tal design. As the spam-filter catch rate rises above 
99.99%, the number of false positives also rises. 
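The trade-off between the two error types can be seen by moving a filter threshold over scored messages. The scores, labels, and function name below are invented for illustration; real filters use far richer evidence.

```python
def filter_errors(scored, threshold):
    """Return (false acceptances, false positives) for a score threshold.

    `scored` is a list of (spam_score, is_spam) pairs; messages whose score
    is at or above `threshold` are filtered out as spam.
    """
    false_accept = sum(1 for s, is_spam in scored if is_spam and s < threshold)
    false_positive = sum(1 for s, is_spam in scored if not is_spam and s >= threshold)
    return false_accept, false_positive


# Hypothetical scored mail: three spam messages and three real ones.
mail = [(0.95, True), (0.80, True), (0.55, True),
        (0.60, False), (0.30, False), (0.10, False)]

strict = filter_errors(mail, 0.5)   # catches all spam, loses one real message
lenient = filter_errors(mail, 0.7)  # keeps all real mail, lets one spam through
```

In this toy data no threshold removes both errors, because one real message scores higher than one spam message.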
INTRODUCTION 

Computer Supported Collaborative 
Work (CSCW) 

Information Technology (IT) has a significant im- 
pact on our lives beyond mere information access 
and distribution. IT shapes access to services, tech- 
nology, and people. The design and use of IT can 
change people’s communication styles and the way 
they work, either individually or in a group. The 
recent introduction of groupware and Computer 
Supported Collaborative Work (CSCW) systems 
enables people to collaborate with fewer time and 
space constraints and affects people’s lives and 
their cultures in the long term. 

CSCW is a new and fast developing research 
field. The terms groupware and CSCW were coined 
in the mid-1980s. The study of CSCW and groupware 
could be defined as a middle field of research 
between the study of single user applications (e.g., 
human-computer interaction [HCI] research) and 
applications for organizations (e.g., information sys- 
tems [IS] or management information system [MIS] 
research) (Grudin, 1994). CSCW studies the way 
people work in groups as well as technological 
solutions that pertain to computer networking with 
associated hardware, software, services, and tech- 



niques (Wilson, 1991). There are several alternative 
labels used to denominate CSCW applications: 
groupware, group support systems (GSS), collaborative 
computing, workgroup computing, and multiuser 
applications. 

Some of the key issues studied in CSCW include 
computer-mediated communication, awareness and 
coordination, and multi-user interfaces. However, 
there has been very limited research to account for 
culture in CSCW. In this article, we discuss the role 
of culture in the design and implementation of CSCW 
systems that support work in cross-cultural con- 
texts. We first present two different perspectives on 
culture in the literature. We then review prior re- 
search in both HCI and IS fields and follow with a 
summary of preliminary research work in CSCW 
about cross-cultural group work. We conclude by 
discussing alternative approaches to design and by 
suggesting a theoretical tool that may inform future 
research on the cultural factors in CSCW. 



CULTURE 

Culture is “an integrated system of learned behavior 
patterns that are characteristic of the members of 
any given society. Culture refers to the total way of 
life of particular groups of people. It includes everything 
that a group of people thinks, says, does and 
makes — its systems of attitudes and feelings. Cul- 
ture is learned and transmitted from generation to 
generation” (Kohls, 1996, p. 23). Two distinct per- 
spectives on culture are represented in the literature: 
culture is relatively constant vs. culture is variable 
and situated. The major advocate of the first perspective 
(i.e., culture is a constant entity based on 
shared assumptions) is Hofstede (1980), who de- 
fines culture as “the collective programming of the 
mind which distinguishes the members of one group 
or category of people from another” (p. 25). Re- 
searchers who hold the first perspective on culture 
also define culture as beliefs, values, and assump- 
tions that are reflected in artifacts, symbols, and 
behaviors (Kroeber & Kluckhohn, 1963). Schein 
(1992) defined organizational culture as a set of 
implicit assumptions shared within the group that 
determines its perspective of and reaction to various 
environments. 

The other perspective on culture characterizes it 
as variable, historically situated, and evolving with 
the context. Rather than being a holistic and rela- 
tively stable entity, culture is seen as fragmented, 
variable, contentious, and in-the-making (Brightman, 
1985; Prus, 1997). The values and attitudes of the 
working group affect the behavior of the group, 
whose collective patterns of behavior contribute to 
the group culture. The group culture, in turn, has 
significant impact on the values and attitudes of the 
group. This cyclic relationship is true for not only 
working groups or organizations but also for nations 
(Davison & Jordan, 1996). 

BACKGROUND 

Culture: A Research Issue in Multiple 
Disciplines 

In this section, we review studies from different 
research fields that have investigated the role of 
culture in computer technology. We first describe 
prior research in HCI and IS (or MIS) literature. 
Then, we focus on studies that have accounted for 
cultural factors in CSCW and groupware. 



Current Research in HCI and Information 
System 

HCI researchers have investigated how cultural 
factors may affect design and evaluation of single- 
user applications (Barber & Badre, 1998; Marcus, 
2000; Marcus & Gould, 2000; Sheppard & Scholtz, 
1999). The research in this domain has focused on 
research issues such as cultural usability (Barber & 
Badre, 1998) and the design of intercultural user 
interfaces (UI) (Marcus, 2000). An instance of the 
impact of culture on UI design pertains to the 
meaning of colors. The color red, for example, in 
some cultures is associated with danger, anger, and 
so forth (Dix & Mynatt, 2004). In other cultures, 
such as in China, it is more commonly associated 
with happiness and good luck. Designing UI for 
multicultural audiences may require interfaces that 
adapt the standards to the cultural context of the 
specific audiences. 

Several IS (MIS) studies have investigated the 
influence of cultural factors on the use of informa- 
tion systems. Table 1, reproduced from Ward and 
Ward (2002), summarizes a number of studies on 
GSS and culture. Setting future agendas for IS 
research at the group level of analysis, Walsham 
(2000) observed, “There are clear agendas here for 
IS researchers to investigate in more detail the role 
of groupware in multi-cultural contexts” (p. 204). 

Culture Issue in CSCW and Groupware 

Located between HCI and IS research, CSCW has 
given increasing attention to cultural factors in CSCW 
and groupware. CSCW researchers have acknowl- 
edged the relevance of culture to appropriately 
design groupware and to successfully support coop- 
erative work. For example, Olson and Olson (2001) 
have observed that remote teams misunderstand 
each other because of cultural differences. Dix and 
his colleagues have observed that lack of consider- 
ation for different cultural perceptions and habits 
about personal space (proxemics) may have un- 
pleasant effects in cross-cultural meetings (Dix & 
Mynatt, 2004). The following section discusses two 
distinctive examples of system design that support 
cross-cultural communication. 
Table 1. Research on GSS and culture (reproduced from Ward and Ward, 2002) 

| Author | Activity | Results | Groups Researched |
| --- | --- | --- | --- |
| Tan et al. (1993) | Influence of minority source | Status influence altered | |
| Aiken et al. (1993) | Effective use of technology | Effective regardless of culture or language | Malaysian and American groups |
| Watson et al. (1994) | Adoption of technology | Culture will shape adoption of GSS features. Meeting designers need to match tools and communication to meeting goals and cultural norms | Singapore and US groups |
| Niederman (1997) | New technology; new meeting norms | Reaction similar; some differences | |
| Aitkinson and Pervan (1998) | Anonymity | Higher productivity | Four national groups |
| Abdat and Pervan (1999) | | | Indonesian groups |
| Anderson (2000) | Cognitive conflict task | No difference for pre-meeting consensus, influence equality, and post-meeting consensus. No difference for consensus change. Higher levels of perceived process gains, perceived decision satisfaction, perceived decision process satisfaction, and perceived quality of discussion | Multicultural and US groups |


DESIGN APPROACHES OF 
SUPPORTING CULTURE IN 
COLLABORATION 

Okamoto, Isbister, Nakanishi, and Ishida (2002) have 
designed and implemented large screen systems that 
support cross-cultural communication that happens 
synchronously with communicators either at the same 
location or in remote locations. In their large screen 
systems, communicators’ real images can be seen 
from the large screen, thus enabling their communi- 
cation through nonverbal cues. Communicators’ cul- 
tural backgrounds and shared information based on 
their profiles are presented on the large screen, 
including language knowledge, culture literacy and 
experience (e.g., how long the person has been 
immersed in the culture), and culture affinity and ties 
(e.g., how many friends the person has from certain 
countries). The idea of the system is to provide 
support for culture awareness to improve communi- 
cation. 

Grill, Kronsteiner, and Kotsis (2003) suggested 
creating a culture translation agent to support cross-cultural 
communication. Using Hofstede’s 



(1980) definition of culture as collective program- 
ming of the mind, Grill et al. (2003) assume that 
different programming of the minds leads to alter- 
native code bases (i.e., alternative common ground) 
in communication (Clark & Brennan, 1991). The 
authors propose the idea of implementing a cultural 
translation system that helps overcome the misun- 
derstanding in communication due to different code 
bases. In such a culture translation system, a cul- 
ture translation agent (CTA) is created as a modu- 
lar agent. Such an agent functions as a communica- 
tion support tool that monitors whether messages 
sent between communicators might cause misun- 
derstandings due to culture difference and notifies 
communicators about it. The CTA uses a matching 
algorithm to compare phrases and terms in the 
message with a code base constructed on code 
bases of the relevant cultures. 
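The matching step can be pictured as a lookup of message phrases in per-culture code bases. The sketch below is an assumption-laden illustration, not the algorithm of Grill et al. (2003): the function name, the data layout, the culture labels, and the sample entry are all hypothetical.

```python
def flag_phrases(message, sender_culture, receiver_culture, code_bases):
    """List phrases in `message` whose meaning differs between two cultures.

    `code_bases` maps a culture label to {phrase: meaning}. A phrase is
    flagged when the sender's meaning is not the receiver's meaning,
    including the case where the receiver's code base lacks the phrase.
    """
    text = message.lower()
    sender_codes = code_bases.get(sender_culture, {})
    receiver_codes = code_bases.get(receiver_culture, {})
    warnings = []
    for phrase, sender_meaning in sender_codes.items():
        receiver_meaning = receiver_codes.get(phrase)
        if phrase in text and receiver_meaning != sender_meaning:
            warnings.append((phrase, sender_meaning, receiver_meaning))
    return warnings


# Hypothetical code bases with a single entry each.
code_bases = {
    "culture_a": {"table the proposal": "postpone discussion"},
    "culture_b": {"table the proposal": "open discussion"},
}

flagged = flag_phrases("Let's table the proposal until Friday.",
                       "culture_a", "culture_b", code_bases)
clean = flag_phrases("See you on Friday.", "culture_a", "culture_b", code_bases)
```

A fixed dictionary of this kind is exactly the sort of static code base whose limits are discussed next.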

Although the idea of implementing a CTA to 
support cross-cultural communication seems to be 
promising, overall, we consider the design approach 
of Okamoto et al. (2002) more favorable. Privileg- 
ing the perspective that culture is dynamic and 
context-dependent, we argue that a static code 
base cannot reflect the dynamic features of culture 
(specifically referring to the cultural factors that 
significantly affect group collaboration). For ex- 
ample, things that normally would cause miscommu- 
nication because of cultural differences between the 
communicators rather may be understood well be- 
cause one has been exposed to the other’s culture 
for an extended period of time. In this case, a culture 
translator may not be useful for communication. 
Instead, the existence of a translator may be an 
obstacle for communicators to learn each other’s 
culture, which could be a positive outcome of cross- 
cultural communication. Compared to CTA, the 
approach of Okamoto et al. (2002) takes into ac- 
count the dynamic features of culture (e.g., an 
individual’s culture literacy and experience are pro- 
vided on the large screen, and communicators are 
able to see each other and communicate directly). 
Thus, the system supports cross-cultural communi- 
cation by providing individual cultural background 
information while simultaneously enabling face-to- 
face communication. 

We believe that appropriately supporting cross- 
cultural coordination represents a new challenge for 
CSCW design. In fact, people from different cul- 
tures may have different value systems and attitudes 
toward the same activity (e.g., expectations and 
assumptions on labor division and deadlines), differ- 
ent understanding of rules of the group, and so forth. 
Such differences generally affect both work rela- 
tionships and group performance. 

FUTURE TRENDS 

Activity Theory: A Useful Theoretical 
Tool 

Activity theory is a useful tool to understand cultural 
mediation in human activities. In agreement with 
ecological approaches to HCI and in contrast to 
individual-centric theories, activity theory emphasizes 
the connection rather than the separation between 
human cognition and human action (Bødker, 
2003). Culture is viewed as a primary mediator in 
human activities. 

The cultural-historical approach put forth by 
Russian cultural-historical scholars Leontiev, Luria, 



and Vygotsky draws on Marx’s historical material- 
ism and focuses on the function of culture in human 
development by considering the contributions of 
cultural artifacts, historical development, and prac- 
tical activity (Cole, 1998). Activity theory was born 
from this perspective, where the primary unit of 
analysis is the activity (i.e., the fundamental type of 
context) (Bødker, 1991; Korpela, Mursu & Soriyan, 
2001). Building on this basis, Engeström (1987) has 
depicted the intertwined relationships among sub- 
ject, object, and community of the activity through a 
triangular model (Figure 1). The central subject- 
object-community triangle then is extended to in- 
clude sociocultural forms of mediation: instruments, 
rules, and division of labor (Engeström, 1987). 

Supporting collaborators’ awareness has been a 
central concern for CSCW researchers. Globally, 
three major forms of awareness have been studied 
in CSCW research: social awareness (who is 
present), action awareness (what are they doing), 
and the more general awareness of the entire activ- 
ity (Carroll, Neale, Isenhour, Rosson & McCrickard, 
2003). With the aim of accounting for cultural fac- 
tors in group cooperation, we suggest the inclusion of 
cultural mediation as part of the activity awareness 
concept. Specifically, drawing on Engeström’s (1987) 
activity model, we propose a comprehensive con- 
cept of awareness in CSCW, which accounts for 
collaborators’ awareness of cultural mediators, such 
as group norms and rules, division of labor, and 
collaborative tools. 

However, Engeström’s (1987) model is based on 
the assumption of a single, shared cultural context. 
This model needs to be extended in order to describe 
and explain collaborative phenomena among people 
of different cultures. In fact, different cultures gen- 
erally imply different artifacts, rules, and ways of 
dividing labor. 

Cross-cultural collaboration requires the addi- 
tional task of negotiating meanings at a cultural level. 
Future research issues about awareness of cultural 
mediation in CSCW include the study of awareness 
breakdowns due to lack of visibility or misunderstandings 
about cultural differences; the study of the 
process of building common ground (Clark & 
Brennan, 1991) in cross-cultural settings; and the 
study of the influence of cultural background infor- 
mation on group performance. 






Supporting Culture in Computer-Supported Cooperative Work 



Figure 1. Engeström’s model of human activity (1987) 




CONCLUSION 

In this article, we have reviewed the current under- 
standing of culture as a factor in CSCW. Using two 
examples to illustrate different approaches to design 
CSCW systems that support cross-cultural commu- 
nication (culture translation system vs. support for 
cross-cultural communication and awareness), we 
gave suggestions for system design that takes into 
account the culture factor. We have also suggested 
directions of future research on the culture factor in 
CSCW and groupware. We suggested the introduc- 
tion of culture mediation awareness to the concept 
of activity awareness. The best solution is CSCW 
systems that support culture mediation awareness 
by providing information to users about group cul- 
ture. 
KEY TERMS 

Activity Theory: Construes activity as a collec- 
tive phenomenon. Activity is pursued by individuals 
or groups within a community working toward shared 
objectives or motives and recruiting and transform- 
ing the material environment, including shared tools, 
data, social and cultural structures, and work prac- 
tices (Kuutti, 1991). 

Awareness: “An understanding of the activities 
of others, which provides a context for your own 
activity” (Dourish & Bellotti, 1992, p. 107). 

Computer-Supported Cooperative Work 
(CSCW): A field located between HCI and IS 
research fields, CSCW studies the way people work 
in groups as well as technological solutions that 
pertain to computer networking with associated 
hardware, software, services, and techniques (Wil- 
son, 1991). 



Context: The structure or environment where 
special interactions occur (Giddens, 1984). 

Culture: “An integrated system of learned be- 
havior patterns that are characteristic of the mem- 
bers of any given society. Culture refers to the total 
way of life of particular groups of people. It includes 
everything that a group of people thinks, says, does 
and makes — its systems of attitudes and feelings. 
Culture is learned and transmitted from generation 
to generation” (Kohls, 1996, p. 23). 






Groupware: “Computer-based systems that sup- 
port groups of people engaged in a common task (or 
goal) and that provide an interface to a shared 
environment” (Ellis et al., 1991, p. 40). 



Hofstede’s Cultural Dimensions: Hofstede 
identified five dimensions of national culture: power 
distance, uncertainty avoidance, individualism, masculinity, 
and long-term vs. short-term orientation. 
INTRODUCTION 

Computers have become commonplace tools in edu- 
cational environments and are used to provide both 
basic and supplemental instruction to students on a 
variety of topics. Searching for information in 
hypermedia documents, whether on the Web or 
through individual educational sites, is a common 
task in learning activities. Previous research has 
identified a number of variables that impact how 
students use electronic documents. Individual dif- 
ferences such as learning style or cognitive style 
(Andris, 1996; Fitzgerald & Semrau, 1998), prior 
topic knowledge (Ford & Chen, 2000), level of 
interest (Lawless & Kulikowich, 1998), and gender 
(Beasley & Vila, 1992) all influence performance. 
Additionally, characteristics of the document such 
as the inherent structure of the material, the linking 
structure (Korthauer & Koubek, 1994), and the 
types of navigation tools that accompany the docu- 
ment can affect student performance and behaviour 
(Boechler & Dawson, 2002; McDonald & Stevenson, 
1998, 1999). In short, the effective use of hypermedia 



documents in educational settings depends on com- 
plex interactions between individual skills (e.g., spa- 
tial and reading skills) and the features of the docu- 
ment itself. 



BACKGROUND 

Previous research has suggested that one way of 
addressing ability differences in hypermedia users is 
to follow a compensatory strategy in which users are 
provided with mediators, modalities, or organizing 
structures that make up for a deficit in a particular 
ability (Messick, 1976). One kind of organizing 
structure that can help users make sense of material 
is a spatial structure that illustrates how different 
parts of the material are related. A spatial map, 
spatial overview, or graphic organizer is a visual 
representation of the structure of the document. 
These are usually in a diagrammatic form such as 
block diagrams, diagrams organized around a central 
term (spider map), or hierarchically ordered tree 
diagrams. For example, see Figure 1. 



Figure 1. An example of a hierarchically ordered tree diagram 
Learning depends on the construction of stable 
and usable mental representations of knowledge. 
From an educational perspective, how do we induce 
such representations? When presenting factual (e.g., 
some apples are red) or demonstrable (e.g., gravity 
makes things fall downward) information, creating 
an appropriate mental representation is a matter of 
relying on these physical aspects of the world to 
stand as mental representations to be stored, ma- 
nipulated, or retrieved. The creation of mental rep- 
resentations for abstract and complex ideas is not as 
straightforward. In the cognitive-psychology litera- 
ture, it is suggested that people often use spatial 
structures as metaphors to reason out the relational 
attributes of a set of abstract elements, attributes 
that are not observable (Gattis, 2001). Research 
across several bodies of literature (educational psy- 
chology, information science, instructional technol- 
ogy) suggests that the arrangement of visual infor- 
mation in particular in a hypermedia interface can 
impact both navigation (Allen, 2000; Boechler & 
Dawson, 2002; Chen, 2000; Westerman & Cribbin, 
2000) and learning (Boechler & Shaddock, 2004; 
Mayer & Sims, 1994; Moreno & Mayer, 1999). In 
both cases, a successful spatial or visual arrange- 
ment should make salient the relations between 
semantic elements in the document. Concerning 
navigation, in the information-science literature, Dillon 
(2000) proposes a spatial and semantic model to 
explain hypermedia navigation processes. The spa- 
tial and semantic model assumes all information 
spaces convey structural cues that are both spatial 
and semantic in nature, and that different user 
characteristics and contexts determine which type 
of cues will be relied on in relation to one another. 
Similarly, in the educational-psychology literature, 
Mayer and Sims (1994) propose a dual-coding theory to 
explain learning in hypermedia. In the dual-coding 
theory of multimedia learning, learners construct 
referential connections between the mental repre- 
sentations of the verbal and visual information pre- 
sented within a hypermedia document. Hence, the 
underlying assumption is that, for both navigation 
and learning, the impact of the visual arrangement 
lies in the degree to which it preserves the meaning 
relations between different parts of the document 
material. This mapping of verbal and visual elements 
can be accomplished using multitudes of diverse 



visual cues (spatial separation, clustering, bordering, 
connecting lines, etc.). 






THE EFFECTIVENESS OF GRAPHIC 
ORGANIZERS 

Although not all studies support the positive effects 
of graphic organizers (e.g., Farris, Jones, & Elgin, 
2002; Stanton, Taylor, & Tweedie, 1992), in the 
hypermedia literature, there are many examples of 
the usefulness of graphic organizers. For instance, 
Stanney and Salvendy (1995) found that the performance 
of low-spatial-ability users could be improved 
to the level of high-spatial-ability users by 
providing a 2-D (two-dimensional) hierarchical struc- 
ture as a guide for users. Allen (2000) found that 
low-spatial-ability users performed better when pro- 
vided with a word map: a configuration that showed 
the relationships between words in a bibliographic 
collection. 

McDonald and Stevenson (1999) reported on 
two studies examining the effects of navigational 
aids on navigation and learning. The first study 
indicated that providing a spatial map improved 
navigation performance over using a content list or 
no navigation tool. In this case, the map consisted of 
labels with connecting lines indicating the links be- 
tween nodes. Navigation performance was mea- 
sured by task time and the number of extraneous 
pages accessed. However, this type of spatial map 
did not improve recall for the document material. 
The second study showed that providing a spatial 
map that also included link descriptions that showed 
the conceptual relations between the pages im- 
proved learning. 
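
The two navigation measures named above (task time and the number of extraneous pages accessed) can be computed from a simple page-visit log. The following is a minimal sketch, not the procedure McDonald and Stevenson used; the log format, the `PageVisit` and `navigation_metrics` names, and the sample page names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class PageVisit:
    """One entry in a hypothetical navigation log."""
    page: str
    timestamp: float  # seconds since the start of the session


def navigation_metrics(visits: List[PageVisit],
                       required_pages: Set[str]) -> Tuple[float, int]:
    """Return (task_time, extraneous_pages) for one search task.

    task_time is the time between the first and last logged visit;
    extraneous_pages counts visits to pages the task did not require.
    """
    if not visits:
        return 0.0, 0
    task_time = visits[-1].timestamp - visits[0].timestamp
    extraneous = sum(1 for v in visits if v.page not in required_pages)
    return task_time, extraneous


# Illustrative log: the visit to "history" is not needed for the task.
log = [
    PageVisit("home", 0.0),
    PageVisit("cells", 4.5),
    PageVisit("history", 9.0),
    PageVisit("mitosis", 15.5),
]
time_taken, extra = navigation_metrics(log, {"home", "cells", "mitosis"})
```

Recall of the document material, the learning measure in these studies, would need a separate test and is not captured by a log of this kind.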

Boechler and Shaddock (2004) found that the 
presence of visual links between page labels in a 
navigation tool predicted incidental learning of mate- 
rial during an information-search task. Whether the 
navigation tool was two dimensional or three dimen- 
sional did not predict these learning outcomes. 

Nilsson and Mayer (2002) reported two studies 
using graphic organizers. They concluded that there 
are benefits to graphic organizers, but that such 
benefits come at the expense of other aspects of 
performance. Specifically, a graphic organizer can 
assist users in navigation, but if the organizers make 










the task of navigating too easy, it is less likely users 
will integrate the information they have viewed. 
Clearly, not all features exhibited in graphic organizers
are equally effective at enhancing learning but, in
general, graphic organizers do seem to assist in the
navigation process and in some instances assist learning
as well.

Why Do Graphic Organizers Help?

Researchers have suggested two reasons that graphic 
organizers assist learners (Nilsson & Mayer, 2002). 
First, they reduce the cognitive overhead that stu- 
dents must expend by providing a framework for 
people to take in new information. When learners use 
hypermedia documents, they must remember which 
material was shown and determine how the material 
is related. In other words, they must form a meaning- 
ful representation in memory of how the material is 
organized. Providing this organization up front through 
a graphic organizer lessens the effort that learners 
need to expend to understand the meaning of the 
material. Second, graphic organizers help learners
avoid becoming disoriented as they move through the
document. Disorientation occurs when the user does 
not know where to go next, the user knows where to 
go but not how to get there, or the user does not know 
where he or she is in relation to the overall structure 
of the document. The fewer cognitive resources a
learner needs to use to navigate the document, the
more resources are available to actually learn the
material.



FUTURE TRENDS 

Many studies, such as those reviewed above, that 
evaluate the usefulness of graphic organizers or 
spatial maps report positive effects. Other studies 
report gains in some areas of performance accompanied
by losses in different areas. What these studies
have in common is that they vary widely
in the features of the graphic organizer,
in the task that users are required to perform,
and in the cognitive abilities of the users
themselves. Future research must seek to reveal and
synthesize how these different variables interact
before a complete understanding of the role of graphic
organizers can be achieved.



CONCLUSION 

People may need different kinds of interface sup- 
port to learn effectively in hypermedia environ- 
ments. Understanding the relationships between 
individual skills and types of support will help edu- 
cational designers provide the optimum interface 
for students of different needs. Graphic or spatial 
organizers may be one useful tool for providing such 
support. Cognitive and learning theories can pro- 
vide guidance for exploring the interactions that 
occur between the interface characteristics and 
individual differences for both navigation and learn- 
ing outcomes. 

KEY TERMS

Cognitive Overhead: The amount of mental 
resources that need to be expended to complete a 
given task. 

Cognitive Style: Cognitive style has been de- 
fined as “an individual’s characteristic and consis- 
tent approach to organizing and processing informa- 
tion” (Tennant, 1988, p. 89). 

Compensatory Strategy: An educational ap- 
proach that focuses on providing structures that 
support and enhance learners’ weaknesses rather 
than exploiting their strengths. 

Disorientation: The sensation of feeling lost in 
a hypermedia document, characterized by three 
categories of the user’s experience: (a) The user 
does not know where to go next, (b) the user knows 
where to go but not how to get there, or (c) the user
does not know where he or she is in relation to the 
overall structure of the document. 

Dual-Coding Theory of Multimedia Learn- 
ing: A theory of learning in hypermedia (Mayer & 
Sims, 1994) that is a process account of how learn- 
ers build mental connections between the verbal 
material, the visual material (e.g., images or dia- 
grams), and the meaning that links the two together. 

Spatial Ability: Spatial ability refers to a person’s
ability to perceive, retain, and mentally manipulate 
different kinds of spatial information. There are 



numerous types of spatial ability (e.g., scanning 
ability, visualization) that can be measured by stan- 
dardized tests to detect ability differences between 
learners. 

Spatial and Semantic Model: A model of 
hypermedia navigation proposed by Dillon (2000) 
that is based on the notion of an information space 
for users: “The concept of shape assumes that an 
information space of any size has both spatial and 
semantic characteristics. That is, as well as identify- 
ing placement and layout, users directly recognize 
and respond to content and meaning” (p. 523). 

INTRODUCTION

The history of task analysis is nearly a century old, 
with its roots in the work of Gilbreth (1911) and 
Taylor (1912). Taylor’s scientific management pro- 
vided the theoretical basis for production-line manu- 
facturing. The ancient manufacturing approach us- 
ing craft skill involved an individual, or a small group, 
undertaking, from start to finish, many different 
operations so as to produce a single or small number 
of manufactured objects. Indeed, the craftsperson 
often made his or her own tools with which to make 
end products. Of course, with the growth of civilisation 
came specialisation, so that the carpenter did not fell 
the trees or the potter actually dig the clay, but still 
each craft involved many different operations by 
each person. Scientific management’s novelty was 
the degree of specialisation it engendered: each 
person doing the same small number of things re- 
peatedly. 

Taylorism thus involved some large operation, 
subsequently called a task, that could be broken 
down into smaller operations, called subtasks. Task 
analysis came into being as the method that, accord- 
ing to Anderson, Carroll, Grudin, McGrew, and 
Scapin (1990), “refers to schemes for hierarchical 
decomposition of what people do.” The definition of 
a task remains a “classic and under-addressed prob- 
lem” (Diaper, 1989b). Tasks have been differently 
defined with respect to their scope: from the very 
large and complex, such as document production 
(Wilson, Barnard, & MacLean, 1986), to the very 
small, for example, tasks that “may involve only one 
or two activities which take less than a second to 
complete, for example, moving a cursor” (Johnson 
& Johnson, 1987). Rather than trying to define what 
is a task by size, Diaper’s (1989b) alternative is 
borrowed from conversation analysis (Levinson, 
1983). Diaper suggests that tasks always have well- 
defined starts and finishes, and clearly related activi- 



ties in between. The advantage of such a definition 
is that it allows tasks to be interrupted or to be 
carried out in parallel. 

Task analysis was always involved with the 
concept of work, and successful work is usually 
defined as achieving some goal. While initially ap- 
plied to observable, physical work, as the field of 
ergonomics developed from World War II, the task 
concept was applied more widely to cover all types 
of work that “refocused attention on the information 
processing aspect of tasks and the role of the human 
operator as a controller, planner, diagnostician and 
problem solver in complex systems” (Annett & 
Stanton, 1998). With some notable exceptions dis- 
cussed below, tasks are still generally defined with 
people as the agents that perform work. For ex- 
ample, Annett and Stanton defined task analysis as 
“[m]ethods of collecting, classifying and interpreting
data on human performance.” 

BACKGROUND 

Stanton (2004) suggests that “[s]implistically, most
task analysis involves (1) identifying tasks, (2) col- 
lecting task data, (3) analyzing this data so that the 
tasks are understood, and then (4) producing a 
documented representation of the analyzed tasks (5) 
suitable for some engineering purpose.” While there 
are many similar such simplistic descriptions, 
Stanton’s five-item list provides an adequate de- 
scription of the stages involved in task analysis, 
although the third and fourth are, in practice, usually 
combined. The following four subsections deal with 
them in more detail, but with two provisos. First, one 
should always start with Stanton’s final item of 
establishing the purpose of undertaking a task analy- 
sis. Second, an iterative approach is always desir- 
able because how tasks are performed is compli- 
cated. 

The Purpose of a Task Analysis

Task analysis has many applications that have noth- 
ing to do with computer systems. Even when used in 
HCI (human-computer interaction), however, task 
analysis can contribute to all the stages of the 
software-development life cycle. In addition, task 
analysis can make major contributions to other ele- 
ments associated with software development, in 
particular the preparation of user-support systems 
such as manuals and help systems, and for training, 
which was the original application of hierarchical 
task analysis (HTA; Annett & Duncan, 1967; Annett,
Duncan, Stammers, & Gray, 1971). HTA was the 
first method that attempted to model some of the 
psychology of people performing tasks. 

Although infrequently documented, identifying 
the purposes for using task analysis in a software 
project must be the first step (Diaper, 1989a) be- 
cause this will determine the task selection, the 
method to be used, the nature of the outputs, and the 
level of analysis detail necessary. The latter is vital 
because too much detailed data that does not subse- 
quently contribute to a project will have been expen- 
sive to collect, and too high a level will require 
further iterations to allow more detailed analysis 
(Diaper, 1989b, 2004). Decomposition-orientated 
methods such as HTA partially overcome the level- 
of-detail problem, but at the expense of collecting 
more task data during analysis. Collecting task data 
is often an expensive business, and access to the 
relevant people is not always easy (Coronado & 
Casey, 2004; Degen & Pedell, 2004; Greenberg, 
2004). Within a software-development life cycle, 
Diaper (2004) has suggested that one identify all the 
stages to which a task analysis will contribute and 
then make selections on the basis of where its 
contribution will be greatest. 

Identifying Tasks 

In the context of task scenarios, which Diaper 
(2002a, 2002b) describes as “low fidelity task simu- 
lations,” Carroll (2000) rightly points out that “there 
is an infinity of possible usage scenarios.” Thus, only 
a sample of tasks can be analysed. The tasks chosen 
will depend on the task analysis’ purpose. For new 
systems, one usually starts with typical tasks. For 
existing systems and well-developed prototypes, one 



is more likely to be concerned with complex and 
difficult tasks, and important and critical ones, and, 
when a system is in use, tasks during which failures 
or problems have occurred. Wong (2004) describes 
his critical decision method as one way of dealing 
with the latter types of tasks. 

Unless there are overriding constraints within a 
software project, then task analysts should expect, 
and accept, the need to be iterative and repeatedly 
select more tasks for analysis. Since the coverage of 
all possible tasks can rarely be complete, there is a 
need for a systematic task selection approach. There 
are two issues of coverage: first, the range of tasks 
selected, and second, the range of different ways 
that tasks may be carried out, both successfully and 
unsuccessfully. 

One criticism of task analysis is that it requires 
extant tasks. On the other hand, all tasks subjected 
to task analysis are only simulations as, even when 
observed in situ, a Heisenberg effect (Diaper, 1989b)
can occur whereby the act of observation changes 
the task. Often, it is desirable to simulate tasks so 
that unusual, exceptional, and/or important task in- 
stances can be studied and, of course, when a new 
system or prototype is not available. 

Collecting Task Data 

There are many myths about task analysis (Diaper et 
al., 2003), and one of the most persistent involves the
detailed observation of people performing tasks. Some- 
times, task-analysis data do involve such observation, 
but they need not, and often it is inappropriate even 
with an existing system and experienced users. 

Johnson, Diaper, and Fong (1984; see also Diaper,
1989b, 2001) claim that one of the major strengths
traditionally associated with task analysis is its capa- 
bility to integrate different data types collected using 
different methods. The critical concept is that of 
fidelity. According to Diaper (2002a, 2002b), “fidel- 
ity, a close synonym is validity, is the degree of 
mapping that exists between the real world and the 
world modelled by the (task) simulation,” although 
as he says parenthetically, “N.B. slightly more accurately
perhaps, from a solipsistic position, it is the
mapping between one model of the assumed real 
world and another.” 

At one end of the task-fidelity spectrum there is 
careful, detailed task observation, and at the other, 
when using scenarios of novel future systems, task
data may exist only in task analysts’ imagination. 
Between, there is virtually every possible way of 
collecting data: by interviews, questionnaires, classi- 
fication methods such as card sorting, ethnography, 
participative design, and so forth. Cordingley (1989) 
provides a reasonable summary of many such meth- 
ods. The primary constraint on such methods is one 
of perspective, maintaining a focus on task perfor- 
mance. For example, Diaper (1990) describes the 
use of task-focused interviews as an appropriate 
source of data for a requirements analysis of a new 
generation of specialised computer systems that were 
some years away from development. 

Task Analysis and Task Representation 

The main representation used by virtually all task 
analysis methods is the activity list, although it goes 
by many other names such as a task protocol or 
interaction script. An activity list is a prose descrip- 
tion of one or more tasks presented as a list that 
usually has a single action performed by an agent on 
each line. Each action on an activity-list line may 
involve one or more objects, either as the target of the 
action or as support for the action, that is, as a tool. 
An important component of an activity list should be 
the identification of triggers (Dix, Ramduny-Ellis, & 
Wilkinson, 2004). While most tasks do possess some 
sequences of activity list lines in which the successful 
completion of an action performed on one line trig- 
gers the next, there are many cases when some 
event, either physical or psychological, causes one of 
two or more possible alternatives to occur. 
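
The structure of an activity list as described above (one action per line, performed by an agent, with objects acting as targets or tools, plus an optional trigger) can be sketched as a small data structure. This is an illustrative assumption about one possible encoding, not a format prescribed by any task analysis method; the `ActivityLine` and `render` names and the sample task are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ActivityLine:
    """One line of an activity list: a single action by one agent."""
    agent: str
    action: str
    targets: List[str] = field(default_factory=list)  # objects acted upon
    tools: List[str] = field(default_factory=list)    # objects supporting the action
    # None means the line follows from completion of the previous line;
    # otherwise the text names the physical or psychological event involved.
    trigger: Optional[str] = None


def render(lines: List[ActivityLine]) -> List[str]:
    """Turn activity lines into numbered prose, one action per line."""
    out = []
    for n, line in enumerate(lines, start=1):
        text = f"{n}. {line.agent} {line.action}"
        if line.targets:
            text += " " + ", ".join(line.targets)
        if line.tools:
            text += " using " + ", ".join(line.tools)
        if line.trigger:
            text += f" [trigger: {line.trigger}]"
        out.append(text)
    return out


# Hypothetical two-line task fragment.
activity_list = [
    ActivityLine("user", "opens", ["document"], ["mouse"]),
    ActivityLine("user", "saves", ["document"], ["keyboard"],
                 trigger="autosave warning appears"),
]
rendered = render(activity_list)
```

Recording the trigger explicitly makes it possible to distinguish lines that simply follow from the previous one from lines selected by some external or psychological event.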

Diaper (2004) suggests that an activity list is 
sometimes sufficient to meet a task analysis’ pur- 
poses. He suggests that one of the main reasons for 
the plethora of task analysis methods is the volume of 
data represented in the activity list format, often tens, 
if not hundreds, of pages. As Benyon and Macaulay 
(2002) discuss, the role of task analysis methods 
applied to activity lists is not only to reduce the sheer 
amount of the data, but to allow the data to be 
abstracted to create a conceptual model for designers. 

The two oldest and most widely cited task-analy- 
sis methods are HTA (Annett, 2003, 2004; Shepherd, 
2001), and goals, operators, methods, and selection 
rules (GOMS; Card, Moran, & Newell, 1983; John & 
Kieras, 1996; Kieras, 2004). HTA is often misunder- 



stood in that it produces, by top-down decomposi- 
tion, a hierarchy of goals, and these are often 
confused with physical or other cognitive activities. 
HTA uses rules to allow the goal hierarchy to be 
traversed. Analyses such as HTA provide a basic 
analysis (Kieras) that can then be used by methods 
such as GOMS. While often perceived as too 
complicated, it is claimed that GOMS provides good 
predictive adequacy of both task times and errors. 
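
As a rough illustration of the point that HTA yields a hierarchy of goals, with rules governing how that hierarchy is traversed, the sketch below represents goals as a tree in which each decomposed goal carries a plan. The `Goal` class, the numbering scheme, and the letter-writing example are assumptions made for illustration and are not taken from Annett's or Shepherd's descriptions of the method.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Goal:
    """A goal in a hierarchical decomposition (hypothetical encoding)."""
    name: str
    subgoals: List["Goal"] = field(default_factory=list)
    plan: str = ""  # rule stating when and in what order subgoals are pursued


def outline(goal: Goal, number: str = "0", depth: int = 0) -> List[str]:
    """Return an indented, numbered outline of the goal hierarchy."""
    lines = [f"{'  ' * depth}{number} {goal.name}"]
    if goal.plan:
        lines.append(f"{'  ' * (depth + 1)}Plan {number}: {goal.plan}")
    for i, sub in enumerate(goal.subgoals, start=1):
        child = str(i) if number == "0" else f"{number}.{i}"
        lines.extend(outline(sub, child, depth + 1))
    return lines


def leaves(goal: Goal) -> List[str]:
    """Collect the goals at which decomposition stopped."""
    if not goal.subgoals:
        return [goal.name]
    result: List[str] = []
    for sub in goal.subgoals:
        result.extend(leaves(sub))
    return result


# Hypothetical decomposition of a document-production goal.
root = Goal(
    "Produce letter",
    subgoals=[
        Goal("Draft text"),
        Goal("Print letter",
             subgoals=[Goal("Select printer"), Goal("Send to printer")],
             plan="do 2.1, then 2.2"),
    ],
    plan="do 1, then 2 when the draft is approved",
)
hierarchy = outline(root)
bottom_level = leaves(root)
```

Note that every node here is a goal, not an observed activity; mapping the bottom-level goals to physical or cognitive activities is a separate step.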

There are between 20 and 200 task analysis 
methods depending on how one counts them. This 
presents a problem as different methods have dif- 
ferent properties and are suitable for different 
purposes. An agreed taxonomy of methods for 
method selection is still unavailable. In Diaper and 
Stanton (2004a), there are half a dozen different 
taxonomies. Diaper (2004), rather depressingly, 
suggests, “in practice, people either choose a task 
analysis method with which they are familiar or 
they use something that looks like HTA.” 

Limbourg and Vanderdonckt (2004) produced a
taxonomy of nine task analysis methods, abridged in
Table 1. The methods have been reorganized so that
they increase in both complexity and expressiveness 
down the table. References and further descriptions 
can be found in Diaper and Stanton (2004a). 

As can be seen from Table 1, there is no ac- 
cepted terminology across task analysis methods. 
An exception, noted by Diaper and Stanton (2004b), 
is that of goals and their decomposition and 
generalisation. 

A number of recent attempts have been made to 
classify tasks into a small number of subtasks 
(Carroll, 2000; Sutcliffe, 2003; Ormerod & Shep- 
herd, 2004). The latter’s subgoal template (SGT) 
method, for example, classifies all information han- 
dling tasks into just four types: act, exchange, 
navigate, and monitor. Underneath this level, they 
have then identified 11 task elements. The general
idea is to simplify analysis by allowing the easy 
identification of subtasks, which can sometimes be 
reused from previous analyses. 

TASK ANALYSIS AT THE HEART OF 
HUMAN-COMPUTER INTERACTION 

Diaper and Stanton (2004b) claim that “[t]oday,
task analysis is a mess.” Introducing Diaper (2002c), 

Table 1. An abridged classification of some task analysis methods (based on Limbourg & Vanderdonckt, 2004). A hyphen marks an empty cell.

| Method  | Origin                                    | Planning                 | Operationalisation        | Hierarchy Leaves | Operational Level           |
|---------|-------------------------------------------|--------------------------|---------------------------|------------------|-----------------------------|
| HTA     | Cognitive analysis                        | Plans                    | -                         | -                | Tasks                       |
| GOMS    | Cognitive analysis                        | Operators                | Methods & selection rules | Unit tasks       | Operators                   |
| MAD*    | Psychology                                | Constructors             | Pre- & postconditions     | -                | Tasks                       |
| GTA     | Computer-supported cooperative work       | Constructors             | -                         | Basic tasks      | Actions & system operations |
| MUSE    | Software engineering & human factors      | Goals & constructors     | -                         | Actions          | Tasks                       |
| TKS     | Cognitive analysis & software engineering | Plans & constructors     | Procedures                | Actions          | -                           |
| CTT     | Software engineering                      | Operators                | Scenarios                 | Basic tasks      | Actions                     |
| Dianne+ | Software engineering & process control    | Goals                    | Procedures                | -                | Operations                  |
| TOOD    | Process control                           | Input/output transitions | -                         | -                | Task                        |



Kilgour suggests that Diaper should consider the 
“rise, fall and renaissance of task analysis.” While 
Diaper argues that really there has been no such fall, 
Kilgour is right that there was a cultural shift within 
HCI in the 1990s away from explicitly referring to 
task analysis. Diaper’s oft-repeated argument has 
been that whatever it is called, analysing tasks has 
remained essential and at the heart of virtually all 
HCI work. Diaper (2002a, 2002b) comments, “It 
may well be that Carroll is correct if he believes that 
many in the software industry are disenchanted with 
task analysis... It may well be that the semantic 
legacy of the term “task analysis” is such that 
alternatives are now preferable.” 

TASK ANALYSIS TODAY 

Central to Diaper’s current definition of task analy- 
sis, and the primary reason why task analysis is at 
the heart of virtually all HCI work, is the concept of 
performance. His definition (Diaper, 2004; Diaper 
et al., 2003) is as follows: 



Work is achieved by the work system making 
changes to the application domain. The 
application domain is that part of the assumed 
real world that is relevant to the functioning of 
the work system. A work system in HCI consists of 
one or more human and computer components 
and usually many other sorts of thing as well. 
Tasks are the means by which the work system 
changes the application domain. Goals are 
desired future states of the application domain 
that the work system should achieve by the tasks 
it carries out. The work system’s performance is
deemed satisfactory as long as it continues to 
achieve its goals in the application domain. Task 
analysis is the study of how work is achieved by 
tasks. 

Most models and representations used in soft- 
ware engineering and HCI are declarative; that is, 
they describe things and some of the relationships 
between things, but not the processes that transform 
things over time. For example, data-flow diagrams 
are atemporal and acausal and specify only that data 
may flow, but not when and under what circumstances.
In contrast, it is essential, for all successful
task-analytic approaches, that performance is mod- 
eled because tasks are about achieving work. 
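
The contrast between a declarative description and a model of performance can be illustrated with a toy sketch based on the definition quoted above: the application domain is a state, tasks are the means by which that state is changed over time, and goals are tests for desired future states. This is only one possible reading, written under simplifying assumptions; the function names and the report example are hypothetical and do not come from Diaper's work.

```python
from typing import Callable, Dict, List

State = Dict[str, object]        # a snapshot of the application domain
Task = Callable[[State], State]  # a task changes the application domain
Goal = Callable[[State], bool]   # a goal tests for a desired future state


def perform(domain: State, tasks: List[Task]) -> List[State]:
    """Apply tasks in order, keeping each intermediate state.

    The returned history is the temporal record that a purely
    declarative description would lack.
    """
    history = [dict(domain)]
    for task in tasks:
        history.append(task(dict(history[-1])))
    return history


def satisfactory(state: State, goals: List[Goal]) -> bool:
    """Performance is satisfactory when every goal is met in this state."""
    return all(goal(state) for goal in goals)


# Hypothetical tasks carried out by a work system.
def write_report(state: State) -> State:
    state["report_written"] = True
    return state


def send_report(state: State) -> State:
    # Sending only succeeds once the report exists (a causal dependency).
    if state["report_written"]:
        state["report_sent"] = True
    return state


goals: List[Goal] = [lambda s: bool(s["report_written"]),
                     lambda s: bool(s["report_sent"])]
history = perform({"report_written": False, "report_sent": False},
                  [write_report, send_report])
```

Because each intermediate state is kept, the sketch can answer when a goal became satisfied and which task caused it, which a static diagram of possible data flows cannot.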

Based on Dowell and Long’s (1989) and Long’s 
(1997) general HCI design problem, Diaper’s (2004)
systemic task analysis (STA) approach emphasizes 
the performance of systems. While STA is offered 
as a method, it is more of an approach in that it deals 
with the basics of undertaking the early stages of a 
task analysis and then allows other analysis methods 
and their representations to be generated from its 
activity list output. The advantage of STA over most 
other task analysis methods is that it models systems 
and, particularly, the performance of the work sys- 
tem. STA allows the boundary definition of a work 
system to change during a task so that different 
constituent subtasks involve differently defined work 
systems, and these may also be differently defined 
for the same events, thus allowing alternative per- 
spectives. 

The novelty of STA’s view of work systems is 
threefold. First, as the agent of change that performs 
work, a work system in HCI applications is not 
usually anthropocentric, but a collection of things, 
only some of them human, that operate together to 
change the application domain. Second, it is the work 
system that possesses goals concerning the desired 
changes to the application domain rather than the 
goals being exclusively possessed by people. Third, 
STA is not monoteleological, insisting that work is 
never achieved to satisfy a single goal, but rather it 
states that there are always multiple goals that 
combine, trade off, and interact in subtle, complex 
ways. 

STA’s modeling of complex work systems has 
recently been supported by Hollnagel (2003b) in 
cognitive task design (CTD); he claims that “cogni- 
tion is not defined as a psychological process unique 
to humans, but as a characteristic of systems perfor- 
mance, namely the ability to maintain control. The 
focus of CTD is therefore the joint cognitive system, 
rather than the individual user.” Hollnagel’s formulation
of CTD is more conservative than STA’s; for
example, CTD is sometimes monoteleological when 
he refers to a single goal, and he restricts nonhuman 
goals to a limited number of things, albeit “a growing 
number of technological artefacts” capable of cog- 
nitive tasks. In STA, it is not some limited number of 



technological artefacts that possess goals and other 
cognitive properties, but the work system, which 
usually has both human and nonhuman components. 

FUTURE TRENDS 

While recognising the difficulty, perhaps impossibil- 
ity, of reliably predicting the future, Diaper and 
Stanton (2004b) suggest that one can reasonably 
predict possible futures, plural. They propose that 
“[f]our clusters of simulated future scenarios for
task analysis organized post hoc by whether an
agreed theory, vocabulary, etc., for task analysis
emerges and whether task analysis methods become
more integrated in the future.” While not predicting
which future or combination will occur, or when,
they are however confident that “[p]eople will always
be interested in task analysis, for task analysis
is about the performance of work,” even though
they admit that “[l]ess certain is whether it will be
called task analysis in the future.”

Probably because of its long history, there is an 
undoubted need for the theoretical basics that under- 
pin the task concept and task analysis to be revisited, 
as Diaper (2004) attempts to do for the development 
of STA. Diaper and Stanton (2004b) also suggest 
that some metamethod of task analysis needs to be 
developed and that more attention needs to be 
placed on a wide range of types of validation, theory, 
methods, and content, and also on methods’ predic- 
tive capability to support design and for other engi- 
neering purposes (Annett, 2002; Stanton, 2002; 
Stanton & Young, 1999). At least two other areas 
need to be addressed in the future: first, how work is 
defined, and second, the currently ubiquitous con- 
cept of goals. 

Task analysis has always been concerned with 
the achievement of work. The work concept, how- 
ever, has previously been primarily concerned with 
employment of some sort. What is needed, as Karat, 
Karat, and Vergo (2004) argue, is a broader defini- 
tion of work. Their proposals are consistent with 
STA’s definition of work being about the work 
system changing the application domain. They per- 
suasively argue for nonemployment application do- 
mains, for example, domestic ones. Thus, a home 
entertainment system, television, or video game, for 
example, could be components of a work system,
and the goals to be achieved would be to induce
pleasure, fun, or similar feelings in their users. That 
such application domains are psychological and in- 
ternal to such work systems’ users, rather than the 
more traditional changes to things separate and 
external to some work system’s components, is also 
consistent with STA’s conceptualisation of task 
analysis. 

Finally, Diaper and Stanton (2004b) broach, in- 
deed they attempt to capsize, the concept of goals. 
They question whether the goals concept is neces- 
sary, either as what causes behavior or as an expla- 
nation for behavior, which they suggest, based on 
several decades of social psychological research, is 
actually usually post hoc; that is, people explain why 
they have behaved in some manner after the event 
with reference to one or more goals that they 
erroneously claim to have possessed prior to the 
behavior. Not only in all task analysis work, but in 
virtually every area of human endeavour, the con- 
cept of goals is used. Abandoning the concept as 
unnecessary and unhelpful is one that will continue 
to meet with fierce resistance since it seems to be a 
cornerstone of people’s understanding of their own 
psychology and, hence, their understanding of the 
world. On the other hand, academic researchers 
have a moral duty to question what may be widely 
held shibboleths. Currently, goal abandonment is 
undoubtedly a bridge too far for nearly everyone, 
which is why STA still uses the goals concept, but 
greater success, if not happiness, may result in some 
distant future if the concept is abandoned. At the 
least, it is time to question the truth and usefulness of 
the goals concept. 

CONCLUSION 

Two handbooks (although at about 700 pages each, 
neither is particularly handy) on task analysis have 
recently become available: Diaper and Stanton 
(2004a) and Hollnagel (2003a). Both are highly
recommended and, while naturally the author pre- 
fers the former because of his personal involvement, 
he also prefers the Diaper and Stanton tome because 
it provides more introductory material, is better 
indexed and the chapters more thoroughly cross- 
referenced, comes with a CD-ROM of the entire 
book, and, in paperback, is substantially cheaper 



than Hollnagel’s book. No apology is made for citing
the Diaper and Stanton book frequently in this 
article, or for the number of references below, 
although they are a fraction of the vast literature 
explicitly about task analysis. Moreover, as task 
analysis is at the heart of virtually all HCI because 
it is fundamentally about the performance of sys- 
tems, then whether called task analysis or not, nearly 
all the published HCI literature is concerned in some
way with the concept of tasks and their analysis. 

REFERENCES 

Anderson, R., Carroll, J., Grudin, J., McGrew, J., & 
Scapin, D. (1990). Task analysis: The oft missed 
step in the development of computer-human inter- 
faces. Its desirable nature, value and role. Human- 
Computer Interaction: Interact’90, 1051-1054. 

Annett, J. (2002). A note on the validity and reliabil- 
ity of ergonomic methods. Theoretical Issues in 
Ergonomics Science, 3, 228-232. 

Annett, J. (2003). Hierarchical task analysis. In E. 
Hollnagel (Ed.), Handbook of cognitive task de- 
sign (pp. 17-36). Mahwah, NJ: Lawrence Erlbaum 
Associates. 

Annett, J. (2004). Hierarchical task analysis. In D. 
Diaper & N. A. Stanton (Eds.), The handbook of 
task analysis for human-computer interaction 
(pp. 67-82). Mahwah, NJ: Lawrence Erlbaum Asso- 
ciates. 

Annett, J., & Duncan, K. D. (1967). Task analysis 
and training design. Occupational Psychology, 41, 
211-221.

Annett, J., Duncan, K. D., Stammers, R. B., & Gray, 
M. J. (1971). Task analysis (Training Information 
Paper No. 6). London: HMSO. 

Annett, J., & Stanton, N. A. (1998). Special issue: 
Task analysis [Editorial]. Ergonomics, 41(11), 1529-
1536. 

Benyon, D., & Macaulay, C. (2002). Scenarios and 
the HCI-SE design problem. Interacting with Com- 
puters, 14(4), 397-405. 






Card, S., Moran, T., & Newell, A. (1983). The 
psychology of human-computer interaction. 
Mahwah, NJ: Lawrence Erlbaum Associates. 

Carroll, J. M. (2000). Making use: Scenario-based 
design for human-computer interactions. Cam- 
bridge, MA: MIT Press. 

Cordingley, E. (1989). Knowledge elicitation tech- 
niques for knowledge-based systems. In D. Diaper 
(Ed.), Knowledge elicitation: Principles, tech- 
niques and applications (pp. 87-176). West Sus- 
sex, UK: Ellis Horwood. 

Coronado, J., & Casey, B. (2004). A multicultural 
approach to task analysis: Capturing user require- 
ments for a global software application. In D. Diaper 
& N. A. Stanton (Eds.), The handbook of task 
analysis for human-computer interaction (pp. 
179-192). Mahwah, NJ: Lawrence Erlbaum Associ- 
ates. 

Degen, H., & Pedell, S. (2004). The JIET design for 
e-business applications. In D. Diaper & N. A. 
Stanton (Eds.), The handbook of task analysis for 
human-computer interaction (pp. 193-220). 
Mahwah, NJ: Lawrence Erlbaum Associates. 

Diaper, D. (1989a). Task analysis for knowledge 
descriptions (TAKD): The method and an example. 
In D. Diaper (Ed.), Task analysis for human- 
computer interaction (pp. 108-159). West Sussex, 
UK: Ellis Horwood. 

Diaper, D. (1989b). Task observation for human- 
computer interaction. In D. Diaper (Ed.), Task 
analysis for human-computer interaction (pp. 
210-237). West Sussex, UK: Ellis Horwood. 

Diaper, D. (1990). Analysing focused interview data 
with task analysis for knowledge descriptions 
(TAKD). Human-Computer Interaction: Inter- 
act ’90, 277-282. 

Diaper, D. (2001). Task analysis for knowledge 
descriptions (TAKD): A requiem for a method. 
Behaviour and Information Technology, 20(3), 
199-212. 

Diaper, D. (2002a). Scenarios and task analysis. 
Interacting with Computers, 14(4), 379-395. 

Diaper, D. (2002b). Task scenarios and thought. 
Interacting with Computers, 14(5), 629-638. 



Diaper, D. (2002c). Waves of task analysis. Inter- 
faces, 50, 8-10. 

Diaper, D. (2004). Understanding task analysis for 
human-computer interaction. In D. Diaper & N. A. 
Stanton (Eds.), The handbook of task analysis for 
human-computer interaction (pp. 5-47). Mahwah, 
NJ: Lawrence Erlbaum Associates. 

Diaper, D., May, J., Cockton, G., Dray, S., Benyon, 
D., Bevan, N., et al. (2003). Exposing, exploring, 
exploding task analysis myths. HCI2003 Proceed- 
ings (Vol. 2, pp. 225-226). 

Diaper, D., & Stanton, N. A. (2004a). The hand- 
book of task analysis for human-computer inter- 
action. Mahwah, NJ: Lawrence Erlbaum Associ- 
ates. 

Diaper, D., & Stanton, N. A. (2004b). Wishing on a 
star: The future of task analysis for human-com- 
puter interaction. In D. Diaper & N. A. Stanton 
(Eds.), The handbook of task analysis for human- 
computer interaction (pp. 603-619). Mahwah, NJ: 
Lawrence Erlbaum Associates. 

Dix, A., Ramduny-Ellis, D., & Wilkinson, J. (2004). 
Trigger analysis: Understanding broken tasks. In D. 
Diaper & N. A. Stanton (Eds.), The handbook of 
task analysis for human-computer interaction 
(pp. 381-400). Mahwah, NJ: Lawrence Erlbaum 
Associates. 

Dowell, J., & Long, J. (1989). Towards a conception 
for an engineering discipline of human factors. Er- 
gonomics, 32(11), 1513-1535. 

Gilbreth, F. B. (1911). Motion study. Princeton, NJ: 
Van Nostrand. 

Greenberg, S. (2004). Working through task-centred 
system design. In D. Diaper & N. A. Stanton (Eds.), 
The handbook of task analysis for human-com- 
puter interaction (pp. 49-68). Mahwah, NJ: 
Lawrence Erlbaum Associates. 

Hollnagel, E. (2003a). Handbook of cognitive task 
design. Mahwah, NJ: Lawrence Erlbaum Associ- 
ates. 

Hollnagel, E. (2003b). Prolegomenon to cognitive 
task design. In E. Hollnagel (Ed.), Handbook of 
cognitive task design (pp. 3-15). Mahwah, NJ: 
Lawrence Erlbaum Associates. 




John, B. E., & Kieras, D. E. (1996). Using GOMS 
for user interface design and evaluation: Which 
technique? ACM Transactions on Computer-Hu- 
man Interaction, 3, 320-351. 

Johnson, H., & Johnson, P. (1987). The develop- 
ment of task analysis as a design tool: A method 
for carrying out task analysis (ICL Report). Un- 
published manuscript. 

Johnson, P., Diaper, D., & Long, J. (1984). Tasks, 
skills and knowledge: Task analysis for knowledge 
based descriptions. Interact ’84: First IFIP Con- 
ference on Human-Computer Interaction, 1, 23- 
27. 

Karat, J., Karat, C.-M., & Vergo, J. (2004). Expe- 
riences people value: The new frontier for task 
analysis. In D. Diaper & N. A. Stanton (Eds.), The 
handbook of task analysis for human-computer 
interaction (pp. 585-602). Mahwah, NJ: Lawrence 
Erlbaum Associates. 

Kieras, D. (2004). GOMS models for task analysis. 
In D. Diaper & N. A. Stanton (Eds.), The hand- 
book of task analysis for human-computer inter- 
action (pp. 83-116). Lawrence Erlbaum Associ- 
ates. 

Levinson, S. C. (1983). Pragmatics. Cambridge, 
MA: Cambridge University Press. 

Limbourg, Q., & Vanderdonckt, J. (2004). Compar- 
ing task models for user interface design. In D. 
Diaper & N. A. Stanton (Eds.), The handbook of 
task analysis for human-computer interaction 
(pp. 135-154). Mahwah, NJ: Lawrence Erlbaum 
Associates. 

Long, J. (1997). Research and the design of human- 
computer interactions or “What happened to valida- 
tion?” In H. Thimbleby, B. O’Conaill, & P. Thomas 
(Eds.), People and computers XII (pp. 223-243). 
New York: Springer. 

Ormerod, T. C., & Shepherd, A. (2004). Using task 
analysis for information requirements specification: 
The sub-goal template (SGT) method. In D. Diaper 
& N. A. Stanton (Eds.), The handbook of task 
analysis for human-computer interaction (pp. 
347-366). Lawrence Erlbaum Associates. 



Shepherd, A. (2001). Hierarchical task analysis. 
London: Taylor and Francis. 

Stanton, N. A. (2002). Developing and validating 
theory in ergonomics. Theoretical Issues in Ergo- 
nomics Science, 3, 111-114. 

Stanton, N. A. (2004). The psychology of task 
analysis today. In D. Diaper & N. A. Stanton (Eds.), 
The handbook of task analysis for human-com- 
puter interaction (pp. 569-584). Mahwah, NJ: 
Lawrence Erlbaum Associates. 

Stanton, N. A., & Young, M. S. (1999). What price 
ergonomics? Nature, 399, 197-198. 

Sutcliffe, A. (2003). Symbiosis and synergy? Sce- 
narios, task analysis and reuse of HCI knowledge. 
Interacting with Computers, 15(2), 245-264. 

Taylor, F. W. (1912). Principles of scientific man- 
agement. New York: Harper and Row. 

Wilson, M., Barnard, P., & MacLean, A. (1986). 
Task analysis in human-computer interaction 
(Rep. No. HF122). IBM Hursley Human Factors. 

Wong, B. L. W. (2004). Critical decision method 
data analysis. In D. Diaper & N. A. Stanton (Eds.), 
The handbook of task analysis for human-com- 
puter interaction (pp. 569-584). Mahwah, NJ: 
Lawrence Erlbaum Associates. 

KEY TERMS 

Activity List: A prose description of a task or 
subtask divided into lines to represent separate task 
behaviors and that usually has only one main agent 
and one action per line. 

Application Domain: That part of the assumed 
real world that is changed by a work system to 
achieve the work system’s goals. 

Goal: A specification of the desired changes a 
work system attempts to achieve in an application 
domain. 

Performance: The quality, with respect to both 
errors and time, of work. 

Subtask: A discrete part of a task. 

Task: The mechanism by which an application 
domain is changed by a work system to achieve the 
work system’s goals. 

Work: The change to an application domain by 
a work system to achieve the work system's goals. 

Work System: That part of the assumed real 
world that attempts to change an application domain 
to achieve the work system’s goals. 

INTRODUCTION 

In the ontological engineering research field, the con- 
cept of “task ontology” is well known as a useful 
technology for systemizing and accumulating the knowl- 
edge needed to perform problem-solving tasks (e.g., diagno- 
sis, design, scheduling, and so on). A task ontology 
refers to a system of vocabulary/concepts used as 
building blocks to perform a problem-solving task, described in 
a machine readable manner so that the system and 
humans can collaboratively solve a problem based 
on it. 

The concept of task ontology was proposed by 
Mizoguchi (Mizoguchi, Tijerino, & Ikeda, 1992, 1995), 
and its validity is substantiated by the development of 
many practical knowledge-based systems (Hori & 
Yoshida, 1998; Ikeda, Seta, & Mizoguchi, 1997; 
Izumi & Yamaguchi, 2002; Schreiber et al., 2000; 
Seta, Ikeda, Kakusho, & Mizoguchi, 1997). He 
stated: 

...task ontology characterizes the computational 
architecture of a knowledge-based system which 
performs a task. The idea of task ontology which 
serves as a system of the vocabulary/concepts 
used as building blocks for knowledge-based 
systems might provide an effective methodology 
and vocabulary for both analyzing and 
synthesizing knowledge-based systems. It is useful 
for describing inherent problem-solving structure 
of the existing tasks domain-independently. It is 
obtained by analyzing task structures of real 
world problem. ... The ultimate goal of task 
ontology research is to provide a theory of all the 
vocabulary/concepts necessary for building a 
model of human problem solving processes. 
(Mizoguchi, 2003) 

We can also recognize task ontology as a static 
user model (Seta et al., 1997), which captures the 
meaning of problem-solving processes, that is, the 
input/output relation of each activity in a problem- 
solving task and its effects on the real world as well 
as on the human mind. 



BACKGROUND 

Necessity of Building Task Ontologies 
as a Basis of HCI 

It is extremely difficult to develop an automatic 
problem-solving system that can cope with a variety 
of problems. The main reason is that the knowledge 
for solving a problem varies considerably depending 
on the nature of the problems. This engenders a fact 
that is sometimes ignored: Users have more knowl- 
edge than computers. From this point of view, the 
importance of a user-centric system (DeBells, 1995) 
is now widely recognized by many researchers. 
Such a framework follows a collaborative, problem- 
solving-based approach between human and com- 
puter by establishing harmonious interaction be- 
tween them. 

Many researchers implement such a framework 
with a human-friendly interface using multimedia 
network technologies. Needless to say, it is impor- 
tant to apply not only the design principles of the 
human interface but also principled knowledge for 
exchanging meaningful information between hu- 
mans and computers. 

Systems have been developed to employ re- 
search results of the cognitive science field in order 
to design usable interfaces that are acceptable to 
humans. However, regarding the content-oriented 
view, it is required that the system can understand 
the meaning of a human’s cognitive activities in order 
to capture a human’s mind. 

We, therefore, need to define a cognitive model, 
that is, to define the cognitive activities humans 
perform in a problem-solving/decision-making pro- 
cess and the information they infer, and then system- 
ize them as task ontologies in a machine understand- 
able manner in order to develop effective human- 
computer interaction. 

Problem-Solving Oriented Learning 

A task with complicated decision making is referred 
to as “Problem-Solving Oriented Learning (PSOL) 
task” (Seta, Tachibana, Umano, & Ikeda, 2003; Seta 
& Umano, 2002). Specifically, this refers to a task 
that does not only require learning to build up suffi- 
cient understanding for planning and performing 
problem-solving processes but also to gain the abil- 
ity/skill of making efficient problem-solving deci- 
sions based on sophisticated strategies. 

Consider for example, a learner who is not very 
familiar with Java and XML programming and tries 
to develop an XML-based document retrieval sys- 
tem. A novice learner in a problem-solving domain 
tries to gather information from Web resources, 
investigates and builds up his/her own understanding 
of the target area, and makes plans to solve the 
problem at hand and then perform problem-solving 
and learning processes. Needless to say, a complete 
plan cannot be made at once, but is detailed gradually 
by iterating, spirally, those processes while applying 
a “trial and error” approach. Thus, it is important for 
a learner to control his/her own cognitive activities. 



Facilitating Learners’ Meta Cognition 
through HCI 

In general, most learners in PSOL tend to work in an 
ad hoc manner without explicit awareness of mean- 
ing, goals and roles of their activities. Therefore, it is 
important to prompt construction of a rational spiral 
towards making and performing efficient problem- 
solving processes by giving significant direction 
using HCI. 

Many researchers in the cognitive science field 
proposed a concept whereby metacognition plays an 
important role to acquire and transfer expertise 
(Brown, Bransford, Ferrara, & Campione, 1983; 
Flavell, 1976; Okamoto, 1999). Furthermore, re- 
peated interaction loops between metacognition ac- 
tivities and cognition activities play an important 
role in forming an efficient plan for problem-solving 
and learning processes. 

Figure 1 shows the plan being gradually detailed 
and refined along the time axis. Figure 1(a) is a 
planning process in which a learner has explicit aware- 
ness of the interactions and iterates metacognition activi- 
ties and cognition activities spirally, while Figure 
1(b) is a planning process with only implicit awareness of 
them. In PSOL, monitoring and control of problem- 
solving/learning processes are typical activities of 
metacognition, while performing those processes is an activity of 
cognition. It is natural that the former case allows 
efficient plans for problem-solving workflow more 



Figure 1. The interaction helps the effective planning process 

Figure 2. Rasmussen’s (1986) cognitive model 




rapidly than the latter. Without explicit awareness of 
interaction loops, a learner tends to get confused and 
lose his/her way because nested structures of his/her 
work and new information of the target world impose 
heavy loads. 

Therefore, it is important to implement an HCI 
framework that enables effective PSOL by position- 
ing a learner at the center of the system as a subject 
of problem solving or learning, and providing appro- 
priate information to prompt the learner’s 
metacognition effectively. 



MAIN ISSUES IN TASK 
ONTOLOGY-BASED HCI 

In this section, we introduce our approach to support- 
ing PSOL in order to explain the task ontology-based HCI 
framework. 

Rasmussen’s (1986) cognitive model is adopted 
as a reference model in the construction of the task 
ontology for supporting PSOL. It simulates the pro- 
cess of human cognition in problem-solving based on 
cognitive psychology. Cognitive activity in PSOL is 
related to this model based on which PSOL task 
ontology is constructed. This provides a learner with 
useful information for effective performance of cog- 
nitive activity at each state, according to the theoreti- 
cal framework that was revealed in the cognitive 
psychology. 



Figure 2 represents an outline of Rasmussen’s 
cognitive model known as the ladder model. 

Activities in PSOL broadly comprise activities in 
connection with a problem-solving act and activities 
in connection with a learning act (see Figure 3 in the 
next section). 

An Activation activity in Rasmussen’s cogni- 
tive model corresponds to the situation in which a 
problem is given in problem-solving activities, or 
one in which a learner detects change in the real 
world. An Observe activity corresponds to observ- 
ing the details of the change or a gap from the 
problem-solving goal. An Identify activity corre- 
sponds to identifying its possible cause. An Inter- 
pret activity corresponds to interpreting the influ- 
ence of the change on problem solving and deciding 
the problem-solving goal. A Define Task activity 
corresponds to determining a problem-solving task 
for implementing it based on the problem-solving 
goal. A Formulate Procedure activity corresponds 
to setting up a problem-solving plan to solve the 
problem-solving task. 

Although basically the same correspondence 
applies to learning activities as to problem- 
solving activities, the object of learning activities 
is mainly the state of one’s own knowl- 
edge or understanding, that is, they are metacognition activi- 
ties. Namely, the Activation activity in Rasmussen’s 
cognitive model corresponds to detecting the change 
of one’s own knowledge state. The Observe activ- 
ity corresponds to observing details or a gap from its 
own understanding state (goal state) decided as a 
goal of learning. The Identify activity corresponds 
to identifying its possible cause. The Interpret ac- 
tivity corresponds to interpreting the influence of its 
own understanding state, especially the influence 
on problem solving in PSOL, and deciding the goal 
of learning. The Define Task activity corresponds 
to setting up a learning task for implementing it 
based on the problem-solving goal. The Formulate 
Procedure activity corresponds to setting up a 
learning plan to solve the problem-solving task. 
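
The two readings of each activity described above can be laid out side by side as a lookup table. The following is only a minimal sketch in Python: the names Activity, CORRESPONDENCE, and describe are hypothetical, the wording paraphrases the two preceding paragraphs, and only the six activities named there are covered.

```python
from enum import Enum


class Activity(Enum):
    """The six ladder-model activities named in the text (hypothetical encoding)."""
    ACTIVATION = "Activation"
    OBSERVE = "Observe"
    IDENTIFY = "Identify"
    INTERPRET = "Interpret"
    DEFINE_TASK = "Define Task"
    FORMULATE_PROCEDURE = "Formulate Procedure"


# Each entry pairs (problem-solving reading, learning reading).
# The table layout is an illustrative assumption, not the authors' ontology format.
CORRESPONDENCE = {
    Activity.ACTIVATION: (
        "a problem is given, or a change in the real world is detected",
        "a change in one's own knowledge state is detected",
    ),
    Activity.OBSERVE: (
        "observe the details of the change or the gap from the problem-solving goal",
        "observe the details or the gap from the understanding state set as the learning goal",
    ),
    Activity.IDENTIFY: (
        "identify the possible cause",
        "identify the possible cause",
    ),
    Activity.INTERPRET: (
        "interpret the influence of the change and decide the problem-solving goal",
        "interpret the influence of one's understanding state and decide the learning goal",
    ),
    Activity.DEFINE_TASK: (
        "determine a problem-solving task based on the problem-solving goal",
        "set up a learning task",
    ),
    Activity.FORMULATE_PROCEDURE: (
        "set up a problem-solving plan",
        "set up a learning plan",
    ),
}


def describe(activity, act):
    """Return the reading of an activity for act 'problem-solving' or 'learning'."""
    problem_solving, learning = CORRESPONDENCE[activity]
    if act == "problem-solving":
        return problem_solving
    if act == "learning":
        return learning
    raise ValueError("act must be 'problem-solving' or 'learning'")
```

Calling describe(Activity.ACTIVATION, "learning") returns the metacognitive reading of the Activation activity, which is the kind of lookup a system would need before it can present state-specific guidance.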

Clarifying the correspondence be- 
tween the cognitive activity of a learner in PSOL 
and the cognitive activity in Rasmussen’s cognitive 
model permits construction of a problem-solving- 
oriented learning task ontology as a basis of human- 
computer interaction that appropriately captures the properties 
of PSOL. Implementing the interaction 
between a system and a learner on this basis allows 
the system to present effective information that encour- 
ages the learner’s appropriate decision-making. 

Figure 3. Learner’s work in problem-solving-oriented learning (PSOL) 

Cognitive Model in Problem-Solving 
Oriented Learning 

Figure 3 shows a cognitive model that captures 
detailed working processes of a learner. This model 
is PSOL task specific while Rasmussen’s model is a 
task independent one. By making the correspon- 
dence between these models, we can define an HCI 
framework based on Rasmussen’s theory. 

Figures 3(i) and 3(iii) represent the planning 
process of the problem-solving plan and learning 
plan, respectively, and 3(viii) and 3(x) represent 
problem-solving and learning processes in Figure 1, 
respectively. Figures 3(v) and 3(vi) represent the 
monitoring process. 

We have presented a problem, say, “developing 
an XML based document retrieval system” in the 
upper left corner. Two virtual persons, a problem- 
solving planner and learning process planner in the 



learner, play roles of planning, monitoring, and con- 
trolling problem-solving, and learning processes, re- 
spectively. 

With PSOL, a learner first defines a problem- 
solving goal and refines it to sub-goals which con- 
tribute to achieving goal G (Figure 3(i)). They are 
refined to feasible problem-solving plans (Figure 
3(ii)); thereafter, the learner performs them to solve 
the problem (Figure 3(viii)). 

If the learner recognizes a lack of knowledge while refining 
the sub-goals and performing problem-solving plans, an 
adequate learning goal (LG) for acquiring that knowledge 
can be generated (Figure 3(iii)) and refined into learning 
process plans (Figure 3(iv)). In learning processes 
(Figure 3(x)), the learner constructs the knowledge (Figure 
3(iv)) required to plan and perform the prob- 
lem-solving process. Based on the constructed knowl- 
edge, she or he specifies and performs the problem- 
solving processes (Figure 3(viii)) to change the real 
world (Figure 3(vii)). The learner assesses gaps 
among goal states (GS), current goal states (CGS) of 
problem-solving process plans, and current state (c- 
state) of the real-world (Figure 3(v)) and ones 
among learning goal states (LGS), current learning 
goal states (CLGS) of learning process plans and 






Task Ontology-Based Human-Computer Interaction 



understanding state (Figure 3(vi)). She or he con- 
tinuously iterates these processes until the c-state of 
the real world satisfies the GS of problem solving. 
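
The iteration just described, in which gaps are assessed and either a learning step or a problem-solving step follows, can be sketched as a simple loop. This is a minimal illustration under strong assumptions: states are modeled as plain sets of condition labels, the names gap, psol_loop, and requires are hypothetical, and each learning or problem-solving step is assumed to succeed immediately, which the text does not claim.

```python
def gap(goal_state, current_state):
    """Conditions in the goal state that the current state does not yet satisfy."""
    return set(goal_state) - set(current_state)


def psol_loop(goal_state, c_state, understanding, requires, max_rounds=20):
    """Iterate monitoring, learning, and problem solving until no gap remains.

    requires maps each goal condition to the knowledge items assumed necessary
    for achieving it (a hypothetical stand-in for the learner's plans).
    """
    c_state = set(c_state)
    understanding = set(understanding)
    log = []
    for _ in range(max_rounds):
        remaining = gap(goal_state, c_state)        # monitoring, cf. Figure 3(v)
        if not remaining:
            break                                   # c-state satisfies the GS
        target = sorted(remaining)[0]
        missing = set(requires.get(target, ())) - understanding  # cf. Figure 3(vi)
        if missing:
            item = sorted(missing)[0]
            understanding.add(item)                 # learning process, cf. Figure 3(x)
            log.append(("learn", item))
        else:
            c_state.add(target)                     # problem solving, cf. Figure 3(viii)
            log.append(("solve", target))
    return log, c_state, understanding
```

The point of the sketch is the alternation itself: a learning step is triggered only when a gap in understanding blocks a problem-solving step, which mirrors the nesting of learning plans inside problem-solving plans.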

It is notable that learners in PSOL have to make 
and perform not only problem-solving plans, but also 
learning plans in the process of problem solving. 
Furthermore, it is important for the learner to moni- 
tor real-world changes by performing problem-solv- 
ing processes and to monitor his/her own under- 
standing states by performing learning processes 
and checking and analyzing whether states of the 
real world and understanding states satisfy defined 
goal states (Figures 3(v) and 3(vi)). A gap be- 
tween current states and goal states leads to the 
definition of new goals intended to close it. 

Consequently, PSOL impels a learner to perform 
complicated tasks with heavy cognitive loads. A 
learner needs to manage and allocate the attentional 
capacity adequately because of limited human 
attentional capacity. This explains why a novice 
learner tends to get confused and lose his/her way. 



Task Ontology for Problem-Solving- 
Oriented Learning 

Figure 4 presents an outline of the PSOL Task 
Ontology (Problem-Solving-Oriented Learning Task 
Ontology). Ovals in the figure express a cognitive 
activity performed by a learner in which a link 
represents an “is-a” relationship. 

The PSOL task ontology defines eight cognitive 
processes modeled in Rasmussen’s cognitive model 
as lower concepts (portions in rectangular box (a) in 
the figure). They are refined through an is-a hierar- 
chy to cognitive activities on the meta-level (meta 
activity), and cognitive activities on the object level 
(base activity). Moreover, they are further refined in 
detail as their lower concepts: a cognitive activity in 
connection with learning activities and a cognitive 
activity in connection with problem-solving activi- 
ties. Thereby, a conceptual system is constructed 
that reflects the task structure of PSOL. For ex- 
ample, typical metacognition activities that a learner 



Figure 4. A hierarchy of problem-solving-oriented learning task ontology 

Figure 5. A definition of “identify c-state of NT in executable plan” 

performs in PSOL, such as “Monitor knowledge 
state” and “Monitor learning plan,” are systematized 
as lower concepts of metacognition activities in the 
Observe activity. 
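
The is-a refinement described above can be illustrated with a small parent map. This is a sketch only: the concept labels for the intermediate levels are hypothetical stand-ins, and only the two monitoring activities named in the text are taken from the source.

```python
# Child concept -> parent concept ("is-a" link). Intermediate labels are assumptions.
IS_A = {
    "Observe": "Cognitive activity",
    "Observe (meta activity)": "Observe",
    "Observe (base activity)": "Observe",
    "Monitor knowledge state": "Observe (meta activity)",
    "Monitor learning plan": "Observe (meta activity)",
}


def is_a_chain(concept):
    """Follow is-a links upward and return the chain from concept to the root."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def is_a(concept, ancestor):
    """True if ancestor is the concept itself or one of its upper concepts."""
    return ancestor in is_a_chain(concept)
```

With such a structure, a system can tell that “Monitor knowledge state” is a kind of Observe activity on the meta level and select guidance defined for any of its upper concepts.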

Figure 5 shows a conceptual definition of an act 
that identifies a possible cause of why a plan is 
infeasible. All the concepts in Figure 5 have a 
conceptual definition in a machine readable manner 
like this, thus, the system can understand what the 
learner tries to do and what information he/she 
needs. 

The definition of the cause identification activity specifies that 
the actor of the activity is a learner; that the learner’s 
awareness of infeasibility is an input 
(in%symptom in Figure 5); and that the lower plans of the 
target plan that the learner is currently trying to make feasible 
serve as reference information 
(in%reference in Figure 5). Moreover, this cognitive 
activity stipulates that the learner’s awareness of 
causes of infeasibility is the output (out%cause in Figure 
5). The definition also specifies that the causes of 
the infeasibility include (axioms in Figure 5): that the 
sufficiency of the target plan is not confirmed 
(cause1 in Figure 5); that the feasibility of a lower 
plan, a finer-grained plan that contributes to realizing the 
target plan, is not confirmed (cause2 in Figure 5); 
and that the target plan is not specified (cause3 in 
Figure 5). Based on this machine understandable 



definition, the system can suggest the candidate 
causes of infeasibility of the object plan, and the 
information the learner should focus on. 
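
To make the three axioms concrete, the following sketch checks a plan against cause1, cause2, and cause3. It is an illustration under stated assumptions, not the notation of Figure 5: the Plan class, its boolean fields, and identify_causes are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """A plan node; the boolean fields are assumed encodings of the learner's state."""
    name: str
    specified: bool = True
    sufficiency_confirmed: bool = True
    feasibility_confirmed: bool = True
    lower_plans: List["Plan"] = field(default_factory=list)


def identify_causes(target: Plan) -> List[str]:
    """Return candidate causes of infeasibility for the target plan.

    Input: the target plan (symptom) and its lower plans (reference).
    Output: the candidate causes, following the three axioms in the text.
    """
    causes = []
    if not target.sufficiency_confirmed:
        causes.append("cause1: sufficiency of the target plan is not confirmed")
    if any(not lower.feasibility_confirmed for lower in target.lower_plans):
        causes.append("cause2: feasibility of a lower plan is not confirmed")
    if not target.specified:
        causes.append("cause3: the target plan is not specified")
    return causes
```

A system holding such a definition could list the candidate causes and point the learner to the lower plans whose feasibility is still unconfirmed.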

Making this PSOL task ontology the basis of 
a system allows the system to offer useful information at the point 
where it encourages appropriate decision-making. This 
is one of the strong advantages of using the PSOL task 
ontology. 

An Application: Planning Navigation as 
an Example 

The screen image of Kassist, a system based on the 
PSOL Task Ontology, is shown in Figure 6. Kassist 
is an interactive open learner-modeling environ- 
ment. The system consists of six panels. A learner 
describes a problem-solving plan, his or her own knowledge 
state about the object domain, and a learning process 
in panels (a), (b), and (c), respectively. 
Furthermore, a learner can describe two kinds of correspon- 
dence relationship: one between a problem-solving pro- 
cess placed in (a) and a concept in (b), that is, the 
knowledge of the object domain required for carrying out the 
process in (a); and one between a learning process placed 
in (c) and a concept in (b), that is, the learning 
process in (c) that constructs an understanding of 
the concept in (b). Each 
shaded node in (b) represents either “knowing” or 
“not knowing” the concept. A learner can also link 
the concepts and processes placed in (a), (b), and (c) with 
the resources in (f) used as reference information, so 
that appropriate information can be referred to when 
required. 

Figure 6. Interactive navigation based on problem solving oriented learning task ontology 

This provides a learner with an environment in 
which she or he can externalize and express his or her own 
knowledge; it then encourages his or her spontane- 
ous metacognition activities such as the Activation 
activity and Observe activity in Rasmussen’s 
model. Moreover, we can implement a more positive 
navigation function that encourages a learner’s 
metacognition activity in the subsequent cognitive 
process by making the ontology the basis of a system. 

Consider, for example, this task “Investigate how 
to develop XML-based document retrieval system”. 
Assume a situation where a learner does not know 
how to tackle this task: 

i. In this situation, a learner clicks the “Investi- 
gate how to develop XML-based document 
retrieval system” node on (c); among the lower 



learning plans connected by the “part-of” links, 
a plan “Learn how to design module structure”, 
whose feasibility is not secured is highlighted 
based on the ontology; and a learner is shown 
the causes of infeasibility with a message “con- 
nected lower plan is not feasible”. 

ii. Then, the system shows that the cause is the learner’s lack 
of the knowledge needed to specify the plan. 

iii. Moreover, plans influenced by the infeasibility 
of this learning plan are displayed in the inter- 
pretation process. 

iv. Here, a problem-solving plan “Design module 
structure” is highlighted. Such navigation al- 
lows a learner to comprehend knowledge re- 
quired to carry out problem-solving and to 
understand at what stage in a problem-solving 
process such knowledge is needed and their 
influence. 
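
The navigation steps above can be sketched as a traversal over the links a learner has described in panels (a), (b), and (c). This is a hypothetical illustration, not Kassist’s implementation: the dictionary names, the navigate function, and the concept label “module structure design” are assumptions made for the example.

```python
# "part-of" links between learning plans in panel (c).
PART_OF = {
    "Investigate how to develop XML-based document retrieval system": [
        "Learn how to design module structure",
    ],
}

# Whether the feasibility of a learning plan is secured.
FEASIBLE = {"Learn how to design module structure": False}

# Learning plan in (c) -> concepts in (b) whose understanding it constructs.
CONSTRUCTS = {"Learn how to design module structure": ["module structure design"]}

# Concept in (b) -> problem-solving plans in (a) that require it.
REQUIRED_BY = {"module structure design": ["Design module structure"]}


def navigate(clicked_plan):
    """Return (infeasible lower plans, influenced problem-solving plans)."""
    infeasible = [
        plan for plan in PART_OF.get(clicked_plan, [])
        if not FEASIBLE.get(plan, True)
    ]
    influenced = []
    for plan in infeasible:
        for concept in CONSTRUCTS.get(plan, []):
            for solving_plan in REQUIRED_BY.get(concept, []):
                if solving_plan not in influenced:
                    influenced.append(solving_plan)
    return infeasible, influenced
```

In this sketch, clicking the top-level learning plan yields the lower plan to highlight in step i and the problem-solving plan to highlight in step iv.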

Thus, a learner can conduct appropriate deci- 
sion-making by acquiring detailed knowledge based 
on this modular design method. 

A series of cognitive activities are typical 
metacognition activities in PSOL. They include: a 
learner’s awareness of feasibility of a learning pro- 
cess as a start; monitoring one’s own knowledge 
state; comprehending its influence on a problem- 
solving plan; and building a learning plan for master- 
ing knowledge required for problem solving. Refer- 
ence to the appropriate information offered by a 
system to a learner encourages his or her appropri- 
ate metacognition activities, which help implement 
effective PSOL. 



FUTURE TRENDS 

Ontology-Aware System 

The systems which support users to perform intelli- 
gent tasks based on the understanding of ontologies 
are called “ontology aware systems” (Hayashi, 
Tsumoto, Ikeda, & Mizoguchi, 2003). Systemizing 
ontologies contributes to providing theories and mod- 
els, which are human-orientated to enhance sys- 
tems’ abilities of explanation and reasoning. Fur- 
thermore, from the viewpoint of system develop- 
ment, building systems with explicit ontologies would 
enhance their maintainability and extendability. There- 
fore, future work in this field should continue devel- 
oping systems that integrate ontology and HCI more 
effectively. 

CONCLUSION 

This article introduced a task ontology based human 
computer interaction framework and discussed vari- 
ous related issues. However, it is still difficult and 
time consuming to build high quality sharable ontolo- 
gies that are based on the analysis of users’ task 
activities. Thus, it is important to continue building 
new methodologies for analyzing users’ tasks. This 
issue should be carefully addressed in the future, and 
we hope more progress can be achieved through 
collaboration between researchers in the fields of 
ontology engineering and human computer interaction. 

KEY TERMS 

Attentional Capacity: Cognitive capacity di- 
vided and allocated to perform a cognitive task. 

Metacognition: Cognition about cognition. It 
includes monitoring the progress of learning, check- 
ing the status of self-knowledge, correcting self- 
errors, analyzing the effectiveness of the learning 
strategies, controlling and changing self-learning 
strategies, and so on. 

Ontology: A specification of a conceptualization 
(Gruber, 1993). 

Problem-Solving Oriented Learning (PSOL): 
Learning not only to build up sufficient understand- 
ing for planning and performing problem-solving 
processes but also to gain the capacity of making 
efficient problem-solving processes according to a 
sophisticated strategy. 

Rasmussen’s Ladder Model: A cognitive 
model that models human decision-making pro- 
cesses. This model is often used for human error 
analysis. 

Task Ontology: A system of vocabulary/con- 
cepts used as building blocks for knowledge-based 
systems. 

INTRODUCTION 

Daily use of computer systems often has been 
hampered by poorly designed user interfaces. Since 
the functionality of a computer system is made 
available through its user interface, its design has a 
huge influence on the usability of these systems 
(Carroll, 2002; Preece, 2002). From the user’s per- 
spective, the user interface is the only visible and, 
hence, most important part of the computer system; 
thus, it receives high priority in designing computer 
systems. 

A plea for human-oriented design in which the 
potentials of computer systems are tuned to the 
intended user in the context of their utilization has 
been made (Rossen & Carroll, 2002). 

An analysis of the strategies that humans use in 
performing tasks that are to be computer-supported 
is a key issue in human-oriented design of user 
interfaces. Good interface design thus requires a 
deep understanding of how humans perform a task 
that finally will be computer-supported. These in- 
sights then may be used to design a user interface 
that directly refers to their information processing 
activities. A variety of methodologies and tech- 
niques can be applied to analyze end users’ informa- 
tion processing activities in the context of a specific 
task environment among user-centered design meth- 
odologies. More specifically, cognitive engineering 
techniques are promoted to improve computer sys- 
tems’ usability (Gerhardt-Powels, 1996; Stary & 
Peschl, 1998). 

Cognitive engineering as a field aims at under- 
standing the fundamental principles behind human 
activities that are relevant in the context of designing 
a system that supports these activities (Stary & 
Peschl, 1998). The ultimate goal is to develop final
versions of computer systems that give users maximal
support in performing their tasks, so that the
intended tasks can be accomplished with minimal
cognitive effort. Empirical research has indeed shown
that users consider cognitively engineered interfaces
superior to non-cognitively engineered interfaces in
terms of supporting task performance, workload, and
satisfaction (Gerhardt-Powals, 1996). Methods such
as the think aloud method, verbal protocol analysis, 
or cognitive task analysis are used to analyze in 
detail the way in which humans perform tasks, 
mostly in interaction with a prototype computer 
system. 

BACKGROUND 

In this section, we describe how the think aloud 
method can be used to analyze a user’s task behav- 
ior in daily life situations or in interaction with a 
computer system and how these insights may be 
used to improve the design of computer systems. 
Thereafter, we will go into the pros and cons of the 
think aloud method. 

The Think Aloud Method 

Thinking aloud is a method that requires subjects to 
talk aloud while solving a problem or performing a 
task (Ericsson & Simon, 1993). This method tradi- 
tionally had applications in psychological and educa- 
tional research on cognitive processes. It is based on 
the idea that one can observe human thought pro- 
cesses that take place in consciousness. Thinking 
aloud, therefore, may be used to know more about 
these cognitive processes and to build computer 
systems on the basis of these insights. Overall, the 
method consists of (1) collecting think aloud reports
in a systematic way and (2) analyzing these reports 
to gain a deeper understanding of the cognitive 
processes that take place in tackling a problem. 
These reports are collected by instructing subjects to 
solve a problem while thinking aloud; that is, stating
directly what they think. The data so gathered are 
very direct; there is no delay. These verbal utter- 
ances are transcribed, resulting in verbal protocols, 
which require substantial analysis and interpretation 
to gain deep insight into the way subjects perform 
tasks (Deffner, 1990). 

The Use of the Think Aloud Method in 
Computer System Design 

In designing computer systems, the think aloud 
method can be used in two ways: (1) to analyze 
users’ task behaviors in (simulated) working prac- 
tices, after which a computer system is actually built 
that will support the user in executing similar tasks in 
future; or (2) to reveal usability problems that a user 
encounters in interaction with a (prototype) com- 
puter system that already supports the user in per- 
forming certain tasks. 

In both situations, the identification and selection 
of a representative sample of (potential) end users is 
crucial. The subject sample should consist of per- 
sons who are representative of those end users who 
will actually use the system in the future. This 
requires a clearly defined user profile, which de- 
scribes the range of relevant skills of system users. 
Computer expertise, roles of subjects in the work- 
place, and a person’s expertise in the domain of 
work that the computer system will support are 
useful dimensions in this respect (Kushniruk &
Patel, 2004). A questionnaire may be given either
before or after the session to obtain this information. 
As the think aloud method provides a rich source of 
data, a small sample of subjects (eight to 10) suffices 
to gain a thorough understanding of task behavior 
(Ericsson & Simon, 1993) or to identify the main 
usability problems with a computer system (Boren & 
Ramey, 2000). A representative sample of the tasks 
to be used in the think aloud study is likewise 
essential. Tasks should be selected that end users 
are expected to perform while using the (future) 
computer system. This requirement asks for a care- 
ful design of tasks to be used in the study to assure 
that tasks are realistic and representative of daily life 
situations. It is recommended that task cases be 
developed from real-life task examples (Kushniruk
& Patel, 2004).



Instructions to the subjects about the task at hand 
should be given routinely. The instruction on thinking 
aloud is straightforward. The essence is that the 
subject performs the task at hand, possibly sup- 
ported by a computer, and says out loud what comes 
to mind. 

A typical instruction would be, “I will give you a 
task. Please keep talking out loud while performing 
the task.” Although most people do not have much 
difficulty rendering their thoughts, they should be 
given an opportunity to practice talking aloud while 
performing an example task. Example tasks should 
not be too different from the target task. As soon as 
the subject is working on the task, the role of the 
instructor is a restrained one. Interference should 
occur only when the subject stops talking. Then, the 
instructor should prompt the subject by the following 
instruction: “Keep on talking” (Ericsson & Simon, 
1993). 

Full audiotaping and/or videorecording of the 
subject’s concurrent utterances during task perfor- 
mance and, if relevant, videorecording of the com- 
puter screens are required to capture all the verbal 
data and user/computer interactions in detail. After 
the session has been recorded, it has to be
transcribed. Complete verbal protocols must be typed
out so that the data can be analyzed in detail (Dix
et al., 1998). Videorecordings may be viewed infor- 
mally, or they may be analyzed formally to under- 
stand fully the way the subject performed the task or 
to detect the type and number of user-computer 
interaction problems. 

The use of computer-supported tools that are 
able to link the verbal transcriptions to the corre- 
sponding video sequences may be considered to 
facilitate the analysis of the video data (Preece, 
2002). 
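
As a rough illustration of what such a linking tool has to do, the sketch below stores each transcribed utterance with its start and end offset in the recording and looks up the utterance spoken at a given video time. The `Segment` type, the `segment_at` helper, and the sample utterances are hypothetical and are not taken from any specific tool.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start_s: float  # offset into the recording, in seconds
    end_s: float    # offset at which the utterance ends
    text: str       # transcribed utterance


def segment_at(segments, t):
    """Return the utterance being spoken at video time t, or None during silence.

    Assumes `segments` is sorted by start_s and that segments do not overlap.
    """
    starts = [s.start_s for s in segments]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < segments[i].end_s:
        return segments[i]
    return None


# Hypothetical excerpt of a transcribed think aloud session.
protocol = [
    Segment(0.0, 4.2, "I need to find the record first."),
    Segment(6.0, 9.5, "Could it be under this menu?"),
    Segment(12.1, 15.0, "Let's try the search box."),
]
```

With the transcript indexed this way, an analyst who pauses the video at a usability problem can retrieve what the subject was saying at that moment, and vice versa.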

Prior to analyzing the audio and/or video data, it 
is usually necessary to develop a coding scheme to 
identify step-by-step how the subject tackled the 
task and/or to identify specific user/computer inter- 
action problems in detail. Coding schemes may be 
developed bottom-up or top-down. In a bottom-up
procedure, one would use part of the protocols to
generate codes, introducing a new code for every
newly observed cognitive subprocess. For example,
one could assign the code guessing to the following
verbal statements: “Could it be X?” or “Let’s try X.”
The remaining protocols then would be analyzed by
using this coding scheme. An excerpt from a coded
verbal protocol is given in Figure 1. Note that the
verbal protocol is marked up with annotations from
the coding scheme. Alternatively, categories in the
coding scheme may be developed top-down, for example,
from examination of categories of interactions from
the human/computer interaction literature (Kushniruk
& Patel, 2004). Before it is applied, a coding scheme
must be evaluated on its intercoder reliability.

Figure 1. Excerpt from a coded verbal protocol for analyzing human task behavior
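
To make the idea of applying a coding scheme concrete, the following minimal sketch encodes the guessing example above as text patterns and applies them to single statements. This is only a first-pass aid under simplifying assumptions: in a real study, human coders assign the codes, and the `CODING_SCHEME` dictionary and `code_statement` function are hypothetical names introduced for illustration.

```python
import re

# Hypothetical coding scheme: each code maps to patterns that signal it.
# Only the "guessing" code and its two example phrasings come from the text.
CODING_SCHEME = {
    "guessing": [r"^could it be\b", r"^let's try\b"],
}


def code_statement(statement, scheme=CODING_SCHEME):
    """Return every code whose patterns match the verbal statement."""
    # Lower-case and normalize typographic apostrophes before matching.
    normalized = statement.strip().lower().replace("\u2019", "'")
    return [
        code
        for code, patterns in scheme.items()
        if any(re.search(p, normalized) for p in patterns)
    ]
```

Statements that match no pattern come back with an empty list; in a bottom-up procedure these are the candidates for which a new code may have to be introduced.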

To prevent experimenter bias, it is best to leave
the actual coding of the protocols to a minimum of
two independent coders. Agreement among the
codes assigned by different coders to the same
verbal statements must then be assessed, for which
the kappa statistic mostly is used (Altman, 1991).
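
For two coders, the usual form of this statistic is Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each coder's code frequencies. The sketch below assumes that this two-coder variant is the one intended; the function name and the sample codes are illustrative.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who coded the same statements in the same order."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same, non-empty set of statements")
    n = len(codes_a)
    # Observed agreement: share of statements given identical codes.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of the coders' marginal frequencies per code.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one and the same code throughout
    return (observed - expected) / (1 - expected)


coder_1 = ["guessing", "guessing", "reading", "reading"]
coder_2 = ["guessing", "reading", "reading", "reading"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this small example the coders agree on three of four statements (p_o = 0.75) while chance agreement is p_e = 0.5, so kappa is 0.5.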

The coded protocols and/or videos can be com- 
piled and summarized in various ways, depending on 
the goal of the study. If the goal is to gain a deep 
insight into the way humans perform a certain task in 
order to use these insights for developing a computer 
system to support task performance, then the proto- 
col and video analyses can be used as input for a 
cognitive task model. Based on this model, a first 
version of a computer system then may be designed. 
If the aim is to evaluate the usability of a (prototype) 
computer system, the results may summarize any 
type and number of usability problems revealed. If 
the computer system under study is still under devel- 
opment, these insights then may be used to better the 
system. 
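
A summary of the type and number of usability problems can be as simple as a tally over the coded protocols. The sketch below assumes each coded problem is recorded as a (subject, problem code) pair; the subject identifiers and problem codes are invented for illustration.

```python
from collections import Counter


def summarize(problems):
    """Map each problem code to (number of occurrences, number of subjects affected)."""
    occurrences = Counter(code for _, code in problems)
    # De-duplicate (subject, code) pairs so each subject counts once per code.
    subjects = Counter(code for _, code in set(problems))
    return {code: (occurrences[code], subjects[code]) for code in occurrences}


# Hypothetical coded usability problems from three think aloud sessions.
coded_problems = [
    ("S1", "navigation"), ("S1", "navigation"), ("S1", "terminology"),
    ("S2", "navigation"), ("S3", "feedback"),
]
summary = summarize(coded_problems)
```

Reporting both counts distinguishes a problem that one subject hit repeatedly from a problem that affected many subjects.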

PROS AND CONS OF THE THINK 
ALOUD METHOD 

The think aloud method, preferably used in
combination with audio- and/or videorecording, is one
of the most useful methods to gain a deep understanding
of the way humans perform tasks and of the spe- 
cific user problems that occur in interaction with a 
computer system. As opposed to other inquiry 
techniques, the think aloud method requires little 
expertise, while it provides detailed insights regard- 
ing human task behavior and/or user problems with 
a computer system (Preece, 2002). On the other 
hand, the information provided by the subjects is 
subjective and may be selective. Therefore, a care- 
ful selection of the subjects who will participate and 
the tasks that will be used in the study is crucial. In 
addition, the usefulness of the think aloud method is 
highly dependent on the effectiveness of the re- 
cording method. For instance, with audiotaping 
only, it may be difficult to record information that is 
relevant to identify step-by-step what the subjects 
were doing while performing a task, whether com- 
puter-supported or not (Preece, 2002). 

Another factor distinguishing the think aloud 
method from other inquiry techniques is the prompt- 
ness of the response it provides. The think aloud 
method records the subject’s task behavior at the 
time of performing the task. Other inquiry tech- 
niques, such as interviews and questionnaires, rely 
on the subject’s recollection of events afterwards. 
Subjects may not be aware of what they actually 
are doing while performing a task or interacting 
with a computer, which limits the usefulness of 
evaluation measures that rely on retrospective
self-reports (Boren & Ramey, 2000; Preece, 2002). An
advantage of thinking aloud as a data eliciting
method, whether audio- or videotaped, is that the
resulting reports provide a detailed account of the
whole process of a subject executing a task.

Although using the think aloud method is rather 
straightforward and requires little expertise, ana- 






The Think Aloud Method and User Interface Design 



lyzing the verbal protocols can be very time-con- 
suming and requires that studies are well planned in 
order to avoid wasting time (Dix et al., 1998). 

The think aloud method has been criticized, par- 
ticularly with respect to the validity and complete- 
ness of the reports it generates (Boren & Ramey, 
2000; Goguen & Linde, 1993). 

An argument made against the use of the think 
aloud method as a tool for system design is that 
humans do not have access to their own mental 
processes and, therefore, cannot be asked to report 
on these. With this notion, verbalizing thoughts is 
viewed as a cognitive process on its own. Since 
humans are poor at dividing attention between two 
different tasks (i.e., performing the task under con- 
sideration and verbalizing their thoughts), it is argued 
that thinking aloud may lead to incomplete reports 
(Nisbett & Wilson, 1977).

However, this critique seems to bear on some 
types of tasks that subjects are asked to perform in 
certain think aloud studies. As Ericsson and Simon 
(1993) point out, in general, talking out loud does not
interfere with task performance and, therefore, does 
not lead to much disturbance of the thought pro- 
cesses. If reasoning takes place in verbal form, then 
verbalizing thoughts is easy and uses no extra human 
memory capacity. However, if the information is 
nonverbal and complicated, verbalization will not 
only cost time but also extra human memory capac- 
ity. Verbalization of thoughts then becomes a cogni- 
tive process by itself. This will cause the report of 
the original task processing to be incomplete, and 
sometimes, it even may disrupt this process (Ericsson 
& Simon, 1993). Therefore, the think aloud method
may be used only on a restricted set of tasks. Tasks
are suitable for the think aloud method when the
information can be reproduced verbally and when
subjects are not asked for information that they do
not directly use in performing the task at hand
(Boren & Ramey, 2000).

The fact that the experimenter may interrupt the 
subject during task behavior is considered another 
source of error, leading to distorted reports (Goguen 
& Linde, 1993). It has been shown, however, that as 
long as the experimenter minimizes interventions in 
the process of verbalizing and merely reminds the 
subject to keep talking when a subject stops
verbalizing his or her thoughts, the ongoing cognitive
processes are no more disturbed than by other
inspection techniques (Ericsson & Simon, 1993).

The think aloud method, if applied under prescribed
conditions and preferably in combination
with audio- and/or videorecording, is a valuable
source of information on human task behavior and, as
such, a useful technique in designing and evaluating
computer systems.

FUTURE TRENDS 

The think aloud method is promoted, and far more
often used, as a method for system usability testing
than as a method for eliciting user requirements. In
evaluating (prototype) computer systems, thinking 
aloud is used to gain insight into end users’ usability 
problems in interaction with a system to better the 
design of these systems. The use of think aloud and 
video analyses, however, may be helpful not merely 
in evaluating the usability of (prototype) computer 
systems but also in analyzing in detail how end users 
tackle tasks in daily life that in the end will be 
computer supported. The outcomes of these kinds of 
analyses may be used to develop a first version of a 
computer system that directly and fully supports 
users in performing these kinds of tasks. Such an 
approach may reduce the time spent in iterative 
design of the system, as the manner in which poten- 
tial end users process tasks is taken into account in 
building the system. 

Although a deep understanding of users’ task 
behaviors in daily settings is indispensable in design- 
ing intuitive systems, we should keep in mind that the 
implementation of computer applications in real-life 
settings may change and may have unforeseen 
consequences for work practices. So, besides in- 
volving potential user groups in an early phase of 
system design and in usability testing, it is crucial to 
gain insight into how these systems may change 
these work practices to evaluate whether and how 
these systems are being used. This adds to our 
understanding of why systems may or may not be 
adopted into routine practice. 

Today, the literature calls for qualitative studies
of the variety of human and contextual factors that
likewise may influence system appraisal
(Aarts et al., 2004; Ammenwerth et al.,
2003; Berg et al., 1998; Orlikowski, 2000; Patton,
2002). In this context, sociotechnical system design 
approaches are promoted (Aarts et al., 2004; Berg 
et al., 1998; Orlikowski, 2000). Sociotechnical
system design approaches are concerned not only with
the human/computer interaction aspects of system
design but also with its psychological, social,
technical, and organizational aspects.
These approaches take an even
broader view of system design and implementation 
than cognitive engineering approaches — the organi- 
zation is viewed as a system with people and tech- 
nology as components within this system. With 
sociotechnical system design approaches, it can be 
determined which changes are necessary and ben- 
eficial to the system as a whole, and these insights 
then may be used to decide on the actions to effect 
these changes (Aarts et al., 2004; Berg et al., 1998). 
This process of change never stops; even when the 
implementation of a computer system is formally 
finished, users still will ask for system improve- 
ments to fit their particular requirements or interests 
(Orlikowski, 2000). 

CONCLUSION 

The use of the think aloud method may aid in 
designing intuitive computer interfaces, because using 
thinking aloud provides us with a more thorough 
understanding of work practices than do conven- 
tional techniques such as interviews and question- 
naires. 

Until now, thinking aloud was used mostly to 
evaluate prototype computer systems. Thinking aloud, 
however, likewise may be used in an earlier phase of 
system design, even before a first version of the 
system is available. It then can be used to elicit every 
step taken by potential end users to process a task in 
daily work settings. These insights then may be used 
as input to the design of a computer system’s first 
version. 

Development, fine-tuning, testing, and final imple- 
mentation of computer systems take a lot of time and 
resources. User involvement in the whole life cycle 
of information systems is crucial, because only when 
we really try to understand end users’ needs and the 
way they work, think, and communicate with each 
other in daily practice can we hope to improve 
computer systems. 

KEY TERMS

Cognitive Engineering: A field aiming at
understanding the fundamental principles behind
human activities that are relevant in the context of
designing a system that supports these activities.

Cognitive Task Analysis: The study of the 
way people perform tasks cognitively. 

Cognitive Task Model: A model representing 
the cognitive behavior of people performing a cer- 
tain task. 

Sociotechnical System Design Approach:
System design approach that focuses on a sociological
understanding of the complex practices in which
a computer system is to function.

Think Aloud Method: A method that requires 
subjects to talk aloud while solving a problem or 
performing a task. 

User Profile: A description of the range of 
relevant skills of potential end users of a system. 

Verbal Protocol: Transcription of the verbal 
utterances of a test person performing a certain 
task. 

Verbal Protocol Analysis: Systematic analysis 
of the transcribed verbal utterances to develop a 
model of the subject’s task behavior that then may 
be used as input to system design specifications. 

Video Analysis: Analysis of videorecordings of 
the user/computer interactions with the aim to detect 
usability problems of the computer system. 






Tool Support for Interactive Prototyping of 
Safety-Critical Interactive Applications 






Rémi Bastide

Université Paul Sabatier, France

David Navarre

Université Paul Sabatier, France

Philippe Palanque

Université Paul Sabatier, France



INTRODUCTION 

The complete specification of interactive applications
is now increasingly considered a requirement in
the field of software for safety-critical systems,
because such applications serve as the main control
interface for these systems. The effort put into using
and deploying formal description techniques is
justified because they are the only means both of
modeling, precisely and unambiguously, all the
components of an interactive application (presentation,
dialogue, and functional core; Pfaff, 1985) and of
reasoning about (and also verifying) the models
(Palanque & Bastide, 1995).

Formal description techniques are usually ap- 
plied to early phases in the development process 
(requirements analysis and elicitation) and clearly 
show their limits when it comes to evaluation (test- 
ing). 

When the emphasis is on validation, iterative
design processes (Hix & Hartson, 1993) are generally
put forward with the support of prototyping as a
critical tool (Rettig, 1994). However, if used in a
nonstructured way and without links to the classical
phases of the development process, results produced
using such iterative processes are usually
weak in terms of reliability. They can also be
unacceptable when interfaces for safety-critical
applications are concerned.

If we consider interfaces such as the ones devel- 
oped in the field of air traffic control (ATC), a new 
characteristic appears, which is the dynamics of 
interaction objects in terms of existence, reactivity, 
and interrelations (Jacob, 1999). In contrast to
WIMP (windows, icons, menus, and pointing) interfaces,
in which the interaction space is predetermined,
these interfaces may include new interactors
(for instance, graphical representations of planes) at
any time during the use of the application
(Beaudouin-Lafon, 2000). Even though this kind of
problem is easily mastered by programming languages,
it is hard to tackle in terms of modeling. This is why
classical description techniques must be improved in
order to describe highly interactive applications
completely.

BACKGROUND 

Several approaches propose solutions for the recon- 
ciliation of the specification and the validation phases 
in the field of interactive applications, but these 
solutions are often incomplete according to three 
different viewpoints. 

• Interaction Style Viewpoint: Post-WIMP 
user interfaces are not yet widely developed. 
For this reason, most of the approaches (see,
for instance, Hussey & Carrington, 1999) only
deal with WIMP interfaces, that is, static interfaces
for which the set and the number of
interactors are known beforehand. The behaviour
and the role of these interactors are standardised
(typically windows and buttons belong to this
category).

• Development Phase Viewpoint: We often 
find disparate solutions that do not integrate the 
various phases in a consistent manner (Martin,
1999). So, most often, several gaps remain to
be bridged manually by the teams involved in
the development process.

• Reliability of Results Viewpoint: Several
integrated approaches have been proposed for
WIMP-interactive applications. Among them,
we find TRIDENT (Bodart, Hennebert,
Leheureux, & Vanderdonckt, 1993), which is
the most successful one as it handles both data
and dialogue description and as it also incorporates
ergonomic evaluation by means of embedded
ergonomic rules. However, the specification
techniques used in the project have not
been provided with analysis techniques for
verifying models and the consistency between
models.

Figure 1. Iterative development process with PetShop



PROTOTYPING CAN BE 
FORMAL, TOO 

The PetShop (Petri Nets Workshop) CASE (com- 
puter-aided software engineering) tool promotes an 
iterative development process articulated around 
the use of a formal description technique of the 
dialogue of the interactive application. 

This formal description technique (based on
Petri nets) was developed at LIIHS in the early ’90s
(Bastide & Palanque, 1990) and has been refined
since then (Bastide & Palanque, 1999). The use of
this kind of modeling technique provides extended
benefits compared with less formal techniques. Indeed,
analysis tools, exploiting the mathematical background
of the formalism, allow the validation of the
application before its implementation.



Figure 2. A menu opened on the radar for a 
selected plane 




A SAFETY-CRITICAL CASE STUDY 

The example presented in this article is extracted
from a complex application studied in the context of
the European project Mefisto
(http://giove.cnuce.cnr.it/mefisto.html).

This project is dedicated to formal description 
techniques and focuses on the field of air traffic 
control. This example comes from an en route air 
traffic control application focusing on the impact of 
data-link technologies in the ATC field. Using such 
applications, air traffic controllers can direct pilots in 
a sector (a decomposition of the airspace). 

The radar image is shown in Figure 2. On the 
radar image, each plane is represented by a graphi- 
cal element providing air traffic controllers with 
useful information for handling air traffic in a sector. 

Figure 3 presents the general architecture of 
PetShop. The rectangles represent the functional 
modules of PetShop. The document-like shapes 
represent the models produced and used by the 
modules. 

PetShop features an object Petri net editor that
allows for the editing and executing of the ObCSs
(object control structures) of the classes. At run
time, the designer can interact with both the
specification and the actual application. These are
presented in two different, overlapping windows in
Figure 4. The window PlaneManager corresponds
to the execution of the window with the object Petri
net underneath.

Figure 3. Architecture of the PetShop environment

A well-known advantage of Petri nets is their
executability. This is highly beneficial to our
approach since, as soon as a behavioural specification
is provided in terms of ObCSs, this specification can
be executed to provide additional insights into the
possible evolutions of the system.

Figure 4 shows the execution of the specification
of the ATC application in PetShop. The ICO
(Interactive Cooperative Objects) specification is
embedded at run time according to the
interpreted execution of the ICO (see Bastide &
Palanque, 1995, 1996, for more details about both
data structures and execution algorithms).

At run time, the user can look at both the specifi- 
cation and the actual application. They are in two 
different windows overlapping in Figure 4. The win- 
dow PlaneManager corresponds to the execution of 
the window with the object petri net underneath. 



Figure 4. Interactive prototyping with PetShop 




In this window, we can see the set of transitions
that are currently enabled (enabled transitions are
shown in dark grey, the others in light grey). This
set is automatically calculated from the current
marking of the object Petri net.

Each time the user acts on the PlaneManager,
the event is passed on to the interpreter. If the
corresponding transition is enabled, then the
interpreter fires it, performs its action (if any),
changes the marking of the input and output places, and
performs the associated rendering (if any).
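
The event-handling cycle just described can be sketched with an ordinary place/transition net. This is a deliberately simplified stand-in, not the PetShop interpreter: ICOs rely on high-level object Petri nets, whereas the sketch uses plain integer token counts. The class names, the place `Planes`, and the transition `deselect` are assumptions made for illustration; `select` and `SelectedPlane` are names used in the article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Transition:
    name: str
    inputs: Dict[str, int]    # place -> tokens consumed when firing
    outputs: Dict[str, int]   # place -> tokens produced when firing
    action: Optional[Callable[[], None]] = None  # rendering or other side effect


@dataclass
class Interpreter:
    marking: Dict[str, int]
    transitions: Dict[str, Transition] = field(default_factory=dict)

    def add(self, transition: Transition) -> None:
        self.transitions[transition.name] = transition

    def is_enabled(self, name: str) -> bool:
        t = self.transitions[name]
        return all(self.marking.get(p, 0) >= n for p, n in t.inputs.items())

    def enabled(self) -> List[str]:
        """Transitions enabled under the current marking."""
        return [name for name in self.transitions if self.is_enabled(name)]

    def handle_event(self, name: str) -> bool:
        """Fire the transition bound to a user event, if it is enabled."""
        if not self.is_enabled(name):
            return False  # the event is ignored
        t = self.transitions[name]
        if t.action is not None:
            t.action()
        for place, n in t.inputs.items():
            self.marking[place] = self.marking.get(place, 0) - n
        for place, n in t.outputs.items():
            self.marking[place] = self.marking.get(place, 0) + n
        return True


rendered: List[str] = []
net = Interpreter(marking={"Planes": 2, "SelectedPlane": 0})
net.add(Transition("select", {"Planes": 1}, {"SelectedPlane": 1},
                   action=lambda: rendered.append("highlight plane")))
net.add(Transition("deselect", {"SelectedPlane": 1}, {"Planes": 1}))

enabled_before = net.enabled()          # only "select" is enabled initially
ignored = net.handle_event("deselect")  # not enabled, so nothing happens
fired = net.handle_event("select")      # fires and moves one token
```

The `enabled()` method corresponds to the set of transitions a tool could highlight for the designer, since it is recomputed from the current marking after every firing.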



INTERACTIVE PROTOTYPING 

Within PetShop, prototyping from a specification is 
performed in an interactive way. At any time during 
the design process, it is possible to introduce modi- 
fications either to make the specification more 
precise or to change it. The advantage of model- 
based prototyping is that it allows designers to 
immediately evaluate the impact of a modification. 

We have identified two different kinds of modi- 
fications that can be performed using PetShop, 
namely lexical and syntactic. 



Lexical Modifications 

The lexical part of the user interface gathers
elementary elements of the presentation (for
instance, the drawing of a button) and all the
elementary actions offered to the user (such as
clicking on a button). Lexical modifications are
concerned with the addition, removal, or modification
of these kinds of elements.

• Changing the rendering of a plane. When
selected, the colour of a plane changes to
green. As a lexical modification, we propose
to change it to red.

  • At the specification level, nothing changes.
  Only the content of the method showSelected
  must be modified, and this must be done using
  the JBuilder environment.

• Changing the event triggering the selection of
a plane. The currently used event is Left
Button Shift Click. We propose to use the
event Left Button Click instead. Therefore,
we need to perform the following modifications.

  • At the specification level, the corresponding
  event must be changed in the activation
  function.

  • At the code level, the Java code must be
  modified to change the adapter (representing
  the activation function) of the widget plane.

Syntactic Modifications 

The syntactic part of the user interface describes the 
links and relationships between the lexical elements 
(for instance, pressing shift and clicking on a plane, 
then right clicking on the plane to open the menu and 
delete the plane). 

• Modifying the selection mechanism. Currently,
only one plane can be selected at a time. In
order to allow multiple selections, the following
modifications must be performed.

  • At the specification level, the inhibitor arc
  (the arc terminated by a black circle)
  linking the transition select to the place
  SelectedPlane (see Figure 4) must be removed.

  • At the code level, there is no modification.

• Defining an upper limit for the number of
planes in the sector. In the initial informal
specification, there is no limit on the number of
planes. Adding a maximum limit of 20 planes
(number of planes normally controlled by a
controller) requires the following modifications.

  • At the specification level, a new place
  must be added in the PlaneManager ObCS
  (Figure 4). Initially, this place will hold 20
  tokens. This place has to be connected by
  an arc to the transition NewPlane of the
  same ObCS. When a plane leaves a sector
  (or is deleted using the menu), the corresponding
  transition must add a new token to this place.

  • At the code level, there is no modification.
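
Both syntactic modifications can be tried out on a small place/transition net. The sketch below is an assumption-laden simplification rather than the PlaneManager ObCS itself: tokens are plain counters, the place name `FreeSlots` and the transition `REMOVE_PLANE` are hypothetical, and selection is modelled by moving a token from `Planes` to `SelectedPlane`. It shows that an inhibitor arc blocks a transition while the inhibiting place is marked, and that a place initially holding 20 tokens bounds the number of planes.

```python
def is_enabled(marking, transition):
    """A transition is enabled if its input places hold enough tokens
    and every place connected by an inhibitor arc is empty."""
    enough = all(marking.get(p, 0) >= n
                 for p, n in transition.get("inputs", {}).items())
    uninhibited = all(marking.get(p, 0) == 0
                      for p in transition.get("inhibitors", ()))
    return enough and uninhibited


def fire(marking, transition):
    """Fire the transition in place; return False if it is not enabled."""
    if not is_enabled(marking, transition):
        return False
    for p, n in transition.get("inputs", {}).items():
        marking[p] = marking.get(p, 0) - n
    for p, n in transition.get("outputs", {}).items():
        marking[p] = marking.get(p, 0) + n
    return True


MAX_PLANES = 20  # limit taken from the text

# Upper limit: NewPlane consumes a token from the added place.
NEW_PLANE = {"inputs": {"FreeSlots": 1}, "outputs": {"Planes": 1}}
# A plane leaving the sector gives its token back.
REMOVE_PLANE = {"inputs": {"Planes": 1}, "outputs": {"FreeSlots": 1}}
# Single selection: the inhibitor arc forbids a second selection.
SELECT_SINGLE = {"inputs": {"Planes": 1}, "outputs": {"SelectedPlane": 1},
                 "inhibitors": ("SelectedPlane",)}
# Multiple selection: the same transition with the inhibitor arc removed.
SELECT_MULTI = {"inputs": {"Planes": 1}, "outputs": {"SelectedPlane": 1}}

marking = {"FreeSlots": MAX_PLANES, "Planes": 0, "SelectedPlane": 0}
arrivals = [fire(marking, NEW_PLANE) for _ in range(MAX_PLANES + 1)]
first = fire(marking, SELECT_SINGLE)   # allowed: nothing is selected yet
second = fire(marking, SELECT_SINGLE)  # blocked by the inhibitor arc
third = fire(marking, SELECT_MULTI)    # allowed once the arc is removed
```

The 21st arrival is refused because the added place is empty, which mirrors the point made in the list: both changes are confined to the net structure, and no application code has to be touched.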



FUTURE TRENDS 

This article has presented the use of PetShop as a
CASE tool for the interactive prototyping of
safety-critical software. We have shown how PetShop
can deal with a specific kind of interactive system in
which interactors can be dynamically instantiated,
and thus the dialogue part of the system may consist
of a potentially infinite number of states. This kind
of application takes full advantage of the expressive
power of the ICO formalism, which is based on
high-level Petri nets and thus is able to deal with
both an infinite number of states and concurrent
behaviours.

Sophisticated interaction techniques such as the
multimodal ones bring new challenges such as temporal
constraints and the description of fusion mechanisms.
ICOs have been extended to deal with
multimodal interactive systems (Palanque & Schyn,
2003), and these extensions are currently being
integrated into the PetShop environment.

CONCLUSION 

Prototyping is now recognized as a cornerstone of
the successful construction of interactive systems
as it allows making users the centre of the
development process. However, prototyping tends to
produce low-quality software as no specification or
global design is undertaken. We have shown in this
article how formal specification techniques can
contribute to the development process of interactive
systems through prototyping activities.

The ICO formal specification technique has reached
a level of maturity that allows it to cope with
real-size, dynamic interactive applications, and the
PetShop environment is currently available on the Web
(http://liihs.irit.fr/petshop/start/pubs.html).
This CASE tool allows designers to build formal
descriptions in a modeless and interactive way, thus
allowing them to continuously assess their design
with respect to the actual execution of their
specification and the actual verification results from
integrated analysis tools. A real-size application has
been completely specified in the context of the
European project Mefisto
(http://giove.cnuce.cnr.it/mefisto.html).
However, the work done on this air traffic control
application has also shown the amount of work that
is still required before the environment can be used
by people other than those who took part in its
development.

This article has presented a CASE tool called 
PetShop, dedicated to the formal description of 
interactive systems. 

REFERENCES 

Bastide, R., & Palanque, P. (1990). Petri net objects
for the design, validation and prototyping
of user-driven interfaces. Third IFIP Conference
on Human-Computer Interaction: Interact’90, (pp. 
625-631). 

Bastide, R., & Palanque, P. (1995). A Petri net
based environment for the design of event-driven
interfaces. Proceedings of the 16th International
Conference on Application and Theory of Petri 
Nets (ATPN’95) (pp. 153-167). 

Bastide, R., & Palanque, P. (1996). Implementation 
techniques for Petri net based specifications of
human-computer dialogues. Proceedings of the 2nd
Conference on Computer Aided Design of User 
Interfaces (CADUI’96) (pp. 153-167). 

Bastide, R., & Palanque, P. (1999). A visual and 
formal glue between application and interaction. 
Journal of Visual Language and Computing, 
10(3), 129-153. 

Bastide, R., Sy, O., Palanque, P., & Navarre, D. 
(2000). Formal specification of CORBA services: 
Experience and lessons learned. ACM Confer- 
ence on Object-Oriented Programming, Systems, 
Languages, and Applications (OOPSLA), (pp. 153- 
163). 

Beaudouin-Lafon, M. (2000). Instrumental inter- 
action: An interaction model for designing post- 
WIMP user interfaces. ACM CHI’2000 Confer- 
ence on Human Factors in Computing Systems, (pp. 
153-161). 

Bodart, F., Hennebert, A. M., Leheureux, J. M., &
Vanderdonckt, J. (1993). Encapsulating knowledge
for intelligent automatic interaction objects selection.
Human Factors in Computing Systems
INTERCHI '93, 424-429.

Hix, D., & Hartson, R. (1993). Developing user 
interfaces: Ensuring usability through product 
and process. New York: John Wiley & Sons. 

Hussey, A., & Carrington, D. (1999). Model-based
design of user interfaces using Object-Z. Proceedings
of the 3rd International Conference on Computer-Aided
Design of User Interfaces (pp. 153-179).

Jacob, R. (1999). A software model and specifica- 
tion language for non-WIMP user interfaces. ACM 
Transactions on Computer-Human Interaction, 
6(1), 1-46.

Martin, C. (1999). A method engineering frame- 
work for modelling and generating interactive 
applications. Third International Conference on 
Computer-Aided Design of User Interfaces, (pp. 
50-63). 

Palanque, P., & Bastide, R. (1995). Verification of 
an interactive software by analysis of its formal 
specification. Proceedings of Interact ’95 (pp. 191- 
196). 

Palanque, P., & Schyn, A. (2003). A model-based
approach for engineering multimodal interactive systems.
Proceedings of INTERACT 2003, IFIP TC
13th Conference on Human Computer Interaction
(pp. 543-550).

Pfaff, G. (Ed.). (1985). User interface manage- 
ment systems: Eurographics seminar. Seeheim, 
Germany: Springer Verlag. 

Rettig, M. (1994). Prototyping for tiny fingers. Com- 
munications of the ACM, 37(4), 21-27. 

KEY TERMS 

ATC (Air Traffic Control): This acronym re- 
fers to both the activities and the systems involved in 
the management of flights by air-traffic controllers. 
The air traffic controllers’ main task is to ensure 
flight safety with an efficient, secure, and ordered 
air traffic flow. ATC systems are dedicated to the 
support of these tasks. 
CASE (Computer-Aided Software Engineering):
This acronym refers to a set of tools dedicated
to support various phases in the development pro- 
cess of software systems. Usually, they support 
modeling activities and the refinement of models 
toward implementation. 

Interactors: Elementary interactive components 
such as push buttons, text fields, list boxes, and so 
forth. 



ObCS (Object Control Structure): A
behavioural description of objects and classes.

PetShop (Petri Nets Workshop): A CASE 
tool dedicated to the formal design and specification 
of interactive safety-critical software. 

WIMP (Windows, Icons, Menus, and Point- 
ing): This is a classical interaction technique found 
in most window managers like Microsoft Windows. 

INTRODUCTION

With the increasing demand for global communication 
between countries, it is imperative that we understand 
the importance of national culture in human commu- 
nication on the World Wide Web (WWW). As we 
consider the vast array of differences in the way we 
think, behave, assign value, and interact with others, 
culture becomes a focal point in research of online 
communication. More than ever, culture has become 
an important human-computer interaction (HCI) is- 
sue, because it impacts both the substance and the 
vehicle of communication via communication technologies.
Global economics and information delivery
are leading to even greater diversification among individuals
and groups of users who employ the WWW as
a key resource for accessing information and pur- 
chasing products. Companies will depend more on the 
Internet as an integral component of their communi- 
cation infrastructure. With a shift toward online ser- 
vices for information, business professionals have 
identified international Web usability as an increas- 
ingly relevant area of HCI research. What must be 
addressed are the cultural factors surrounding Web 
site design. Specifically, we argue that culture is a
discernible variable in international Web site design
and that, as such, design should better accommodate global
users who seek to access online information or products.
There are still many unresolved questions regarding 
cross-cultural HCI and communication and the deliv- 
ery of information via the Web. To date, there has 
been no significant connection made between culture 
context and cognition, cross-cultural Web design, and 
related issues of HCI. This correlation is relevant for 
identifying new knowledge in cross-cultural Web 
design theory and practice. 

BACKGROUND 

In order to maximize easy access to online
information and products, Web sites should be built
to accommodate more than multilingual communication
(Sheridan, 2001). Rather, Web site
developers responsible for design and testing should
have an equal concern for the complexity inherent in
cultural diversity, that is, with factors such as social
and psychological development. Past and recent
cross-cultural studies and theoretical models have made
direct links between culture, context, and related
preferences (Chau, Cole, Massey, Montoya-Weiss,
& O’Keefe, 2002; Hall, 1959, 1966; Hofstede, 1997;
Trompenaars, 1997), but with a high emphasis placed
especially on behavior.

Nisbett and Norenzayan (2002) argued that most
psychologists in the 20th century continue to hold
erroneous assumptions about the relationship between
culture and cognition. This, they say, is fostered
by theoretical positions in learning theory, as
seen in the work of Miller and Johnson-Laird (1976),
as well as other cognitive scientists who embrace 
Piaget’s position of extreme formalism and content 
independence of inferential rules found in culture. 
The problem with this view is that formalist theory 
assumes that cognitive processes are universal and 
all normal humans are equipped with the same set of 
attentional, memorial, learning, and inferential pro- 
cedures, regardless of the content they operate on. 
However, the landmark work of cultural psycholo- 
gist Nisbett (Nisbett, 2003; Nisbett & Norenzayan, 
2002; Nisbett, Peng, Choi, & Norenzayan, 2001) 
provides a significant rebuttal to these assumptions 
about the independent relationship between culture 
and cognition. He includes a theoretical model of 
significant depth on which to build support for a new 
theory for international Web design that addresses 
the complexity of cognition in cultural context. Nisbett 
and Norenzayan (2002) state that the idea that 
culture profoundly influences the contents of thought 
through shared knowledge structures has been a 
central theme in modern cognitive psychology. 

Nisbett’s perspective on culture and cognition is
derived from a range of studies and knowledge
claims based on the earliest work of Vygotsky
(1979, 1989) and carried on by Luria (1971, 1976) in
the 1960s-1970s. The central theory of the Russian 
School argues that cognitive processes emerge from 
practical activity that is culturally constrained and 
historically developing. Nisbett makes note of the 
significance of this early Vygotskian research (Luria,
1971; Vygotsky, 1979) to promote the idea that 
culture fundamentally shapes thought. This latter 
claim has provided a theoretical model for cultural 
cognition theory (CCT), which goes to the root of 
human information processing and other complex 
cognitive systems that are affected by cultural con- 
text. Nisbett and Norenzayan (2002) present evi- 
dence concerning assumptions about universality 
and content independence, concluding that multiple 
studies support their view of the relationship be- 
tween culture and cognition, casting substantial doubt 
on the standard assumptions held by many psycholo- 
gists. At the same time, the vast majority of empiri- 
cal studies in cross-cultural psychology and cultural 
anthropology support the position that cognition is 
dependent upon cultural context, especially where 
formal education is present. 

MAIN FOCUS 

Over the last ten years, usability theory and testing 
have dominated the discussion among HCI and 
information technologists in academia and industry, 
setting the stage for culture to become the next 
frontier of Web design research (Dalal, Quible, &
Wyatt, 2000; Eveland & Dunwoody, 2000; 
Fernandes, 1995; Kim & Allen, 2002; Marcus & 
Gould, 2000; Sears, Jacko, & Dubach, 2000; Wheeler, 
1998; Zahedi, Van Pelt, & Song, 2001; Zassoursky, 
1991). At present, many technologists neglect the 
impact of culture on communication, content deliv- 
ery, and information structure. For them, technology 
is often used in place of creative and research-based 
solutions for overcoming limitations to human com- 
munication. However, as the strategic planning of 
Web design has fallen into the hands of HCI design- 
ers, social scientists, and communication experts, 
technology has not been seen as the panacea to 
online communication issues between cultures. 
Rather, in the process of investigating the most 
appropriate ways to maximize online information
delivery to international users, these specialists are
exploring ways to confront an array of cultural 
contexts that are both vast and complex. 

In regard to HCI, there are multiple usability
studies that address cross-cultural Web site design
from a socio-behavioral perspective (Barber & Badre,
1998; Dalal et al., 2000; Eveland & Dunwoody,
2000; Fernandes, 1995; Honold, 2000; Kim & Allen,
2002; Marcus, 2000, 2003; Marcus & Gould, 2000;
Sears et al., 2000; Wheeler, 1998; Zahedi et al.,
2001). However, the relationship between culture
and cognition as a theoretical model has not been
adequately explored, especially when we
consider user preferences that are culturally determined
by social-cognitive development. Hence, this
article presents the view of cultural cognition theory
(CCT) from the earliest work of Vygotsky (1979,
1989) and Luria (1971, 1976). Their work was
further developed through the contemporary research
of Richard E. Nisbett (Nisbett et al., 2001,
2002; Nisbett, Fong, Lehman, & Cheng, 1987; Nisbett
& Ross, 1980) and colleagues. A relationship is
drawn between CCT and cross-cultural Web devel- 
opment as a means to identify cognitive differences 
among designers that ultimately influence site design 
and user-Web interaction. The collective works 
outlined earlier identify propositions that argue how 
culture shapes cognitive phenomena, influencing the 
content of thought through shared knowledge, and 
subsequently learning, cognitive development, and 
processes that may impact Web design. 

One proposition advanced by Nisbett and
Norenzayan (2002) is that “cultures differ markedly 
in the sort of inferential procedures (cognitive pro- 
cesses) they typically use for a given problem” (p. 
2). To support this claim, they spend considerable
time outlining a range of studies dealing with linguistics
and mathematics that show cultural differences
in basic knowledge structures and inferential procedures
(Lucy, 1992; Luria, 1971; Miller, Smith, Zhu,
& Zhang, 1995; Miller & Stigler, 1987; Wynn, 1990).
Specifically, these studies show infinitely variable
differences in knowledge domains, analytical processes,
learning skills, and inferential procedures
(such as deductive rules and schemes for induction
and causal analysis), among diverse cultures. This is
because these processes operate on different inputs
for different people in different situations and cultures.
A summary of these studies showed repeated
evidence to support the linguistic differences that 
affect thought. Nisbett and Norenzayan (2002) rec- 
ommend that more research is needed to examine the 
pervasiveness of the influence of language on thought. 

Hence, HCI research that focuses on social and 
cognitive science and cross-cultural communication 
methodologies may inform international Web site 
designers. A range of Web design components needs 
to be considered during the development stage of 
cross-cultural sites. Besides the explicit cultural dif- 
ferences of text, numbers, dates, symbol-sets, and 
time, more critical are the implicit and less formal 
dimensions of page format, imagery, colour, informa- 
tion architecture, and system functionality. Russo 
and Boor (1993) discuss the intuitive behaviour that 
influences the use of these elements. Studies focused 
on information technology also show considerable 
cultural differences in attitudes toward computers 
(Choong & Salvendy, 1998, 1999; Igbaria & Zviran, 
1996; Omar, 1992). However, Van Peurssen (1991) 
suggests that culture is a concept that is far too 
complex for mere description. In fact, variations in 
the implicit aspects of Web design may be too subtle 
to discern their cultural origin, and therefore demand 
a more rigorous investigation of their relationship to 
culture and cognition. 

Based on this theoretical foundation, future HCI 
research must identify the knowledge structures, 
logic, and analytical approaches that constitute a 
specific Web design orientation based on the cogni- 
tive structures influenced by cultural context. By 
observing the way Web designers organize online 
information, findings should show cultural orienta- 
tions in design styles, preferences, and strategic 
planning. 

FUTURE TRENDS 

New research opportunities in communication tech- 
nology are emerging worldwide that are working 
towards universal access for all international Web 
users. However, the rapid transition to online deliv- 
ery of information products has forced corporations 
to confront the polarizing effect of culture and com- 
munication as they attempt to build Web sites that 
cross the barriers of language and a broad range of 
more subtle cultural issues. To work through these
challenges, technological trends and technologies
will continue to address complex issues surrounding 
semantics and language delivery through the Web. 
Language technologies are showing promising ca- 
pabilities in addressing the challenges of unre- 
stricted cross-cultural communication. 

Research in the United States and Europe (Gast, 
2003; Sierra, Wooldridge, Sadeh, Conte, Klusch, & 
Treur, 2000; Wagner, Yezril, & Hassel, 2000) is 
being funded to support cooperatively built tech- 
nologies to better facilitate cross-cultural communi- 
cation. Cross-cultural sites may attempt to appeal 
to international users by using human-like anthropo- 
morphic agents that simulate their cultural profiles. 
This could include culture-centric attributes
such as body language, gestures, and facial expressions
that mimic human characteristics. The intention
would be to adapt to the users’ emotional world
in order to enhance cognitive capabilities and interactions
while using a site. Web site developers will
increasingly take into account the full complexity of 
the human emotional apparatus, including the hu- 
man-like responses that users often seek while 
engaging interactive systems. 

Recent research (Burnett & Buerkle, 2004; 
Dou, Nielsen, & Tan, 2002; Faiola, 2002; Faiola & 
Matei, 2005; Hillier, 2003; Yetim & Raybourn, 
2003; Zahedi et al., 2001) continues to examine the 
influence of culture on Web design by comparing 
subjects from a range of diverse cultures. These 
studies provide computational models that show 
trends and comparisons of the data that can help to 
draw conclusions regarding the influence of cultural 
cognition on local developers and subjects who 
interacted with Web sites. Future studies must also
address cultural differences in: (1) task times, navigational
paths, interface design, and information
architecture, and (2) user preferences for sites
created by designers from their own cultures, that
is, will users demonstrate bias toward color, design,
and information structure by designers from their
own culture?



CONCLUSION 

From a cultural cognition perspective, we have 
addressed how cognitive processes are susceptible 
to cultural variation, adding that cultural orientation 
has a direct impact on designers of Web sites. As a
result, Web developers need to design online sites 
from an understanding of cultural cognitive theory 
and knowledge as part of their basic strategy. Future 
trends in CCT research may prove the assumption 
that cultural differences need to drive variations in 
Web site design and development. 

With increasing dependence on the World Wide 
Web (WWW) for international communication, the 
need for effective delivery of content will force Web 
developers to: (1) move away from homogeneous 
design models and routine time-on-task usability 
testing, (2) devise, design, and test models that can 
account for the complexity of online communication 
and information exchange between a diversity of 
national cultures, and (3) consider the significant 
influence of cultural context on cognition and cogni- 
tive development of Web site designers from a 
diversity of cultural orientations. 

KEY TERMS

Anthropomorphic: An attribution of human
characteristics or behavior to natural phenomena or
inanimate objects. For example, the embodiment of a
graphic user interface agent can be anthropomorphic,
since the representation may take on human
attributes or qualities. Anthropomorphism is an attempt
to design technologies to be user-friendly.

Cognition: The mental processes of an individual,
including internal thoughts, perceptions, understanding,
and reasoning. It includes the way we
organize, store, and process information, as well as
make sense of the environment. It can also include 
processes that involve knowledge and the act of 
knowing, and may be interpreted in a social or 
cultural sense to describe the development of knowl- 
edge. 

Cultural Cognition Theory: A theory that
frames the concept that culture profoundly influences
the contents of thought through shared knowledge
structures and ultimately impacts the design and
development of interactive systems, whether software
or Web sites.

Cultural Psychology: An interdisciplinary field
within the social sciences that brings together general
and cognitive psychology, cross-cultural communication,
anthropology, as well as linguistics and
philosophy. Cultural psychologists study how cultural
context, meaning, practice, and established
institutions might impact individual human psychology.

Culture: A deposit of knowledge, experience, 
beliefs, values, attitudes, meanings, social hierar- 
chies, religion, notions of time, roles, spatial relation- 
ships, concepts of the universe, and material objects 
and possessions acquired by a group of people in the 
course of generations through individual and group 
development (Samovar & Porter, 2003). 

Usability: The effectiveness, efficiency, and
satisfaction with which specified users achieve
specified goals in particular environments. (From
the International Organisation for Standardisation
(ISO) code: ISO/IS 9241-11,
http://www.iso.org/iso/en/ISOOnline.openerpage.)

Vygotskian: Relating to a general theory of cognitive
development, developed by Vygotsky (1979, 1989) in
the 1920s and 1930s in Russia, which suggests that: (1)
social interaction plays a fundamental role in the
development of cognition, and (2) consciousness is
the end product of socialization, for example, that
cognitive development depends upon the (contextual)
zone of proximal development.

INTRODUCTION

There are different ways in which we have used the 
concept of attention with regard to human informa- 
tion processing and behavior (cf Kahneman, 1973). 
Attention could be taken to mean whatever one is 
thinking about, as when a student is lost in the 
thoughts of daydreaming rather than paying atten- 
tion to the teacher’s lesson. Attention can also be 
associated with where we are looking or that for 
which we are looking (cf Moray, 1969), as when a 
flashing Web advertisement takes your attention or 
when one is mentally focused on searching through 
a Web page to find information. 

This attention switching or attention movement 
perspective on attention (cf Broadbent, 1957) is of 
most interest in this article. A flashing Web banner 
advertisement could, by design, take our attention 
from where we had intended to focus, or a Web page 
could be designed such that it draws our interest and 
leads us to seek further information. If a person is 
looking in the wrong place to find what he or she 
wants, then it would be good for us to know about 
this. This article will review some theories of atten- 
tion that are relevant to understanding how human 
attention processing mechanisms work with regard 
to these issues, and will review the basics of a 
method that can be used to track attention movement 
by tracking mouse movements in a browser. This 
method has grounding in well-established theory, 
and it can be used in a laboratory or can be used 
remotely with data saved to a server for replay. 

BACKGROUND 

Serial, Parallel, and Hardwired Systems 
of Attention 

A little over a century ago, interest in the idea of 
attention emerged as researchers began studying
various mechanisms that might affect human mental
processing limitations (e.g., Bryan & Harter, 1899; 
Jastrow, 1892; Solomons & Stein, 1896; Welch, 
1898). Psychologists lost interest in this line of 
research to the study of behaviorism for several 
decades, but renewed interest emerged again in the 
1950s (e.g., Adiseshiah, 1957; Bahrick, Noble, & 
Fitts, 1954; Broadbent, 1957; Garvey & Knowles, 
1954). Throughout the period of the 1950s through 
the 1970s, researchers were in part attempting to 
understand why and how processing limitations occur. 

Single-Channel Hypothesis 

One early view in this rebirth was the single- 
channel hypothesis, which viewed the processing 
system as something like a single-channel, serial 
transmission line (Welford, 1967). In an attempt to 
locate the bottleneck in this communication channel, 
Broadbent (e.g., 1957) proposed that there is a 
many-to-one selection switch in the channel. It is 
difficult, for example, to comprehend multiple con- 
versations at a time even though we can understand 
one conversation out of many and can switch our 
attention to another. The single-channel hypothesis, 
however, was not able to explain the observation 
that people can in other kinds of situations appar- 
ently process multiple tasks concurrently. We can, 
for example, comprehend only one conversation out 
of many, yet can concurrently drive an automobile 
while listening. 

Undifferentiated-Capacity Hypothesis 

Moray (1967) proposed that some of the problems 
with the single-channel hypothesis could be ex- 
plained by a flexible central processor of limited 
capacity. Popularized by Kahneman (1973) and 
labeled the undifferentiated-capacity hypothesis 
by Kerr (1973), this model viewed the processing 
system as possessing a very general pool of resources
that can be allocated to the performance of
various concurrent tasks. This model attempts to 
explain how limitations on processing a particular task
will change depending on what other processing
tasks might also compete for resources from the
central processor. For example, some of us can talk
while typing, but our typing speed and accuracy
often suffer when doing so. Neither of these two
models was viewed by Kahneman as adequate 
alone; Kahneman viewed the single-channel idea as 
associated with processes that have structural limi- 
tations. Our visual system, for example, can only 
point at and process one single view at a time. 

Multiple-Resource Theory 

The undifferentiated-capacity hypothesis is also not 
completely adequate. Researchers found, for ex- 
ample, that it is easier to attend to auditory and visual 
messages concurrently than to two concurrent audio 
messages (Rollins & Hendricks, 1980; Treisman &
Davies, 1973). This could be due in part to the 
existence of more than one flexible processor oper- 
ating in parallel, for example, one limited-capacity 
processor for visual messages and one limited- 
capacity processor for auditory ones, both operating 
in parallel and feeding into a flexible limited-capacity 
central processor. Friedman, Polson, and Dafoe
(1988) found that there are differences in processing 
degradation between tasks processed in each cere- 
bral hemisphere and a common second (concur- 
rently performed) task, further suggesting evidence 
of multiple capacity- or resource-limited processors. 

Automatism and Skilled Processing 

A problem with the capacity explanations is that 
processing can sometimes appear to be resource 
free, or to consume from a processor that has no 
apparent bottlenecks or resource limitations. Early 
researchers such as Bryan and Harter (1899) were 
finding that practice could lead to the automatization 
of task performance, or skill acquisition. The early 
dual-task studies were finding that when two tasks 
are performed concurrently, they tend to interfere 
with each other less and less with continued prac- 
tice. It appears that with practice, some processes 
become hardwired outside of the control of the
flexible processing systems, and so the person can
effortlessly do these automatic processes in parallel 
with the controlled or effortful processes that re- 
quire the use of the flexible general-purpose proces- 
sor (cf Shiffrin & Schneider, 1977). 

The discussion above suggests that there are at 
least three general mechanisms involved in how 
people process information. 

1. System components composed of a flexible, 
general-purpose central processor and other 
more specialized, but flexible, processors. These 
resources can process different tasks concur- 
rently or in parallel. 

2. Serial system components and structurally lim- 
ited components that must be switched from 
one task to another. Eyes can only be pointed 
in one direction at a time and must be physically 
moved if we want to pay attention to something 
else. Ears can receive many conversations at 
once, but the preprocessor associated with 
them can only process a single conversation at 
a time. 

3. Hardwired system components that do not 
consume the resources of these flexible paral- 
lel and serial processing components. Pro- 
cesses become hardwired through practice. 
Learning to ride a bicycle, for example, re- 
quires all of a child’s attention at first, and the 
slightest distraction can cause the child to fall. 
With practice, however, the child will be able to 
ride effortlessly, concurrently carrying on a 
conversation or thinking about something else. 

Voluntary Attention 

The notion that we have a flexible central processor 
or a set of processors and can choose where to focus 
our thinking is associated with what is called volun- 
tary attention (e.g., Hunt & Kingstone, 2003; James, 
1899). A student may choose to daydream rather 
than listen to the teacher: Both tasks can be per- 
formed concurrently, but the student consciously 
and deliberately allocates most attentional resources 
toward thinking about something while allocating 
some resources to listen just enough to pick out 
anything important that should be written in the 
notebook. An online shopper consciously and delib- 
erately chooses to use attentional resources toward 
seeking out the best deal on a new computer rather
than to attend to e-mail or an online forum discussion 
with friends. 

Selective Attention 

Related to voluntary attention is selective attention, 
whereby we can choose what to process when 
presented with choices (e.g., Calmels, Berthoumieux,
& d’Arripe-Longueville, 2004; Moray, 1969). An
automobile driver may choose to look at scenery 
outside of the car window rather than to focus on 
traffic ahead. An online shopper can choose to 
physically shift the eyes to one brand’s ad rather than
the ads for other brands on a single Web page. Some 
of this might be due to allocating resources to differ- 
ent tasks within a capacity-limited processor, and 
some of this might be due to switching our attention 
between serial or structurally limited system re- 
sources. 

Involuntary Attention 

In some cases, our attention is taken without desire 
on our part. Involuntary attention can occur when 
a stimulus is, say, novel and unusual, surprising, or 
highly contrasted with the background (e.g., Berlyne, 
1960; Berti & Schroger, 2003). Web advertisers 
attempt to do this with flashing banner ads and pop- 
up windows. It is possible to become habituated to a 
stimulus, however, and to no longer notice it; flashing 
banner ads, for example, might be less effective as a 
user experiences them more and more. It is possible 
that with practice, some tasks become automated in 
helping us to ignore some stimuli, as when a person 
automatically clicks off a pop-up banner ad without 
much conscious thought. 

METHODS OF TRACKING 
ATTENTION WITH A BROWSER 
MOUSE 

Many methods and devices have been devised for the 
measurement and tracking of attention, but the focus 
of the rest of this article is on how we might be able 
to track attention by using a mouse and a Web
browser. With a mouse, JavaScript, and a Web
browser, we can detect the following: 

• “Click” events (mouse-button presses), as 
when a person does a click-through on a 
banner ad. We know where a person was 
looking when the mouse button was pressed. 

• X- and y-coordinates of the mouse pointer in 
the browser window. We can collect the posi- 
tion of the mouse at any point in time, and so 
by periodically sampling position and time in- 
formation, we can track mouse movements 
within a screen. In this way, we can know 
where a person was considering a click, or we 
can instruct the person to move the pointer 
continuously to show us where he or she was 
looking. 

• “Mouse-over” events, as when the mouse 
pointer is moved over or through an item on a 
menu list, over a button in a button list, or over 
an image in a Web page. If we know the 
location of such items in the page, we can 
collect the mouse-over information along with 
the time associated with the mouse-over event, 
thereby allowing us to track mouse move- 
ments within a screen. 

Collecting Data to Track Attention 

In all the cases above, we are ultimately collecting 
a measure of the screen position of the mouse 
pointer, and in all cases we could additionally take 
a measure of clock time. With an indication of time, 
we then have real-time data that can be used to 
remotely replay where a user’s mouse pointer was 
positioned during a visit to a Web site or within a 
single page. An assumption is that if the user clicks 
on something, he or she presumably has an interest 
in that thing and is therefore paying attention to it. 
By watching where a mouse pointer moves from 
and to, and where clicks are made we could track 
a person’s attention. 
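
As a rough illustration of the replay idea described above, the following sketch turns timestamped pointer samples into a replay schedule of delays. The sample shape ({ x, y, t }) and the function name are assumptions made for this example and are not taken from the author's implementation.

```javascript
// Hypothetical sample shape: { x, y, t }, with t as clock time in milliseconds.
function buildReplaySchedule(samples) {
  // Sort a copy by time so the original log is left untouched.
  const ordered = [...samples].sort((a, b) => a.t - b.t);
  return ordered.map((s, i) => ({
    x: s.x,
    y: s.y,
    // Delay before showing this position, relative to the previous sample.
    delayMs: i === 0 ? 0 : s.t - ordered[i - 1].t,
  }));
}

// Illustrative data only.
const log = [
  { x: 10, y: 20, t: 1000 },
  { x: 40, y: 25, t: 1250 },
  { x: 90, y: 60, t: 1900 },
];
const schedule = buildReplaySchedule(log);
console.log(schedule);
```

A playback routine could then step through the schedule, waiting delayMs before drawing each position, so that the recorded movement is reproduced at its original pace.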

JavaScript, a programming language that is 
included with the mainstream Web browsers, can 
be asked to perform a programmed routine when- 
ever some particular user action occurs. For ex- 
ample, clicking the mouse button when the pointer 
is over a particular hyperlink can send the user to 
another page, open a pop-up window, or (of interest
here) save data associated with the circumstances 
of that user action. If we are attempting to track a 
person’s attention in real time, then we want to save 
mouse position and mouse clicks along with time 
information (Owen, 2002, discusses some issues 
with time-measurement resolution and with saving 
data). 
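
A minimal sketch of such a programmed routine is given below, assuming a hypothetical logger that records the event type, pointer position, and clock time, and passes the records to a caller-supplied send function in small batches. The names createActionLogger, send, and batchSize are illustrative assumptions; only the event properties type, clientX, and clientY correspond to what browser mouse events provide.

```javascript
// Hypothetical logger: buffers records and hands them to `send` in batches.
// `now` is injectable so the example can run with a fake clock.
function createActionLogger(send, batchSize = 3, now = () => Date.now()) {
  let buffer = [];

  // Pass any buffered records to `send` and start a new buffer.
  function flush() {
    if (buffer.length > 0) {
      send(buffer);
      buffer = [];
    }
  }

  // `event` only needs `type`, `clientX`, and `clientY`.
  function handle(event) {
    buffer.push({ type: event.type, x: event.clientX, y: event.clientY, t: now() });
    if (buffer.length >= batchSize) flush();
  }

  return { handle, flush };
}

// Plain objects stand in for browser events here.
const sent = [];
let fakeClock = 0;
const logger = createActionLogger((batch) => sent.push(batch), 2, () => (fakeClock += 100));
logger.handle({ type: "click", clientX: 5, clientY: 7 });
logger.handle({ type: "mousemove", clientX: 6, clientY: 9 });
logger.handle({ type: "click", clientX: 8, clientY: 9 });
logger.flush();
console.log(sent);
```

In a browser, the handler could be registered with document.addEventListener("click", logger.handle), and send could post the batch to a server; how the data are transmitted and stored is left open here.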

X- and Y-Coordinates 

Using JavaScript, it is possible to capture the values 
of the pointer’s x- and y-coordinates when the 
mouse is moving (the author’s example is at
http://mousEye.SyKronix.net). If the mouse moves, then
the x- and y-coordinates of the pointer are captured 
and saved along with the clock time. This makes it 
possible to track movements of the mouse in real 
time, for which differences between the clock time 
on each capture suggest the speed of movement. If 
the user is instructed to move the mouse pointer to 
indicate where he or she is looking, then we could 
capture data that tells us where the person was 
looking in real time. There are two concerns with this 
method, however. One is that a very large amount of
data would be collected when using this method.
Another is that the user might not actually move the 
mouse pointer in tune with where he or she is 
looking. An alternative method that solves these 
problems is presented next. 
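
The data-volume concern and the use of time differences to estimate speed can be sketched as follows. This is an illustration under assumptions: the sample shape ({ x, y, t }), the function names, and the 100-millisecond interval are hypothetical choices, not values from the source.

```javascript
// Keep a sample only if at least `minIntervalMs` has passed since the
// last kept sample, which reduces the amount of data collected.
function thinSamples(samples, minIntervalMs) {
  const kept = [];
  for (const s of samples) {
    const last = kept[kept.length - 1];
    if (!last || s.t - last.t >= minIntervalMs) kept.push(s);
  }
  return kept;
}

// Pointer speed in pixels per second between consecutive samples.
function speeds(samples) {
  const out = [];
  for (let i = 1; i < samples.length; i++) {
    const dx = samples[i].x - samples[i - 1].x;
    const dy = samples[i].y - samples[i - 1].y;
    const dtMs = samples[i].t - samples[i - 1].t;
    out.push(dtMs > 0 ? (Math.hypot(dx, dy) * 1000) / dtMs : 0);
  }
  return out;
}

// Illustrative data only.
const raw = [
  { x: 0, y: 0, t: 0 },
  { x: 3, y: 4, t: 10 },
  { x: 30, y: 40, t: 100 },
  { x: 30, y: 40, t: 120 },
  { x: 60, y: 80, t: 200 },
];
const thinned = thinSamples(raw, 100);
const pxPerSecond = speeds(thinned);
console.log(thinned.length, pxPerSecond);
```

Thinning trades temporal resolution for a smaller log, so the interval would need to be chosen with the purpose of the study in mind.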

Mouse-Over Events 

A simpler and more powerful alternative to detect- 
ing where the user is focusing attention is to use 
JavaScript to detect whenever the mouse pointer 
moves over some particular position on the screen. 
JavaScript can be used to detect whenever the user 
moves the mouse pointer over a hyperlinked portion 
of text or graphic image. This will trigger an event in 
the program that causes the program to save which 
link or image the mouse pointer moved over. This is 
an improvement over the method of saving the x- and 
y-coordinates because the program function is only 
called whenever the user rolls over the target area of 
interest, not simply whenever the mouse is moved. 
We might only be interested, for example, in a user’s
movement between four different quadrants of the
screen. Detecting when the mouse pointer moves
out of one quadrant into another leaves us with a 
much simpler task of data collection and substan- 
tially less data to send to the server, to store, and to 
later analyze. 
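
The quadrant example can be sketched as below. Note the assumption: this version derives quadrant changes from pointer coordinates rather than from mouse-over events attached to four page regions, so it illustrates the data reduction rather than the event mechanism itself. The function names and the 800 by 600 viewport are hypothetical.

```javascript
// Map a pointer position to one of four quadrants of a viewport.
function quadrantOf(x, y, width, height) {
  const col = x < width / 2 ? 0 : 1;
  const row = y < height / 2 ? 0 : 1;
  return ["top-left", "top-right", "bottom-left", "bottom-right"][row * 2 + col];
}

// Reduce a stream of samples to one record per quadrant change.
function quadrantTransitions(samples, width, height) {
  const transitions = [];
  let current = null;
  for (const s of samples) {
    const q = quadrantOf(s.x, s.y, width, height);
    if (q !== current) {
      transitions.push({ from: current, to: q, t: s.t });
      current = q;
    }
  }
  return transitions;
}

// Illustrative data only: five samples in an 800 x 600 viewport.
const moves = [
  { x: 100, y: 100, t: 0 },
  { x: 200, y: 150, t: 50 },
  { x: 500, y: 100, t: 120 },
  { x: 600, y: 400, t: 300 },
  { x: 610, y: 410, t: 320 },
];
const transitions = quadrantTransitions(moves, 800, 600);
console.log(transitions);
```

Only the changes of quadrant are retained, so the log grows with the number of region changes rather than with the number of pointer movements.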

The most important advantage of this method, 
however, is that the program function that is trig- 
gered by the mouse-over event allows us to change 
the image. This is a crucially important advantage in 
a study that wants to track where the user is looking. 
If the image changes, then we can mimic the way 
that people really do look at a Web page, with the 
center of vision, the area of concentration, being 
more clear, and the peripheral areas being less clear. 
The author has used this method in several ways: 

• No change in the image: The movement of 
the mouse pointer over various graphic-image 
fields merely sends back information regarding 
where the pointer was positioned in real time. 
The user is instructed to move the mouse 
pointer wherever he or she is looking. For 
example, if the user is participating in a Web 
usability study, we might ask the user to find 
something within the Web site (an example is at
http://mousEye.SyKronix.net/demos/web.html).
The user might move the pointer
over a button on the left column, but then 
decide to move the pointer over an index item 
along the bottom and click. In this way, we did 
not simply collect information about what was 
ultimately clicked, but collected information 
about the hesitancy in selecting another choice. 
If time information is collected with the trigger 
of these events, then we would also have 
information regarding the amount of time that 
was associated with the hesitancy in consider- 
ing one choice before moving to make another. 
Such data can be collected remotely, sent to a 
server, and played back in real time later as if 
the researcher is actually standing over the 
shoulder of the user. 

• Change in focus: The entire page of, say, a 
one-page advertisement is presented some- 
what out of focus. Moving the pointer over the 
place where the user is looking makes that part 
more clear. In this way, the user is forced to 
reveal where he or she is looking (an example 






is at http://mousEye.SyKronix.net/demos/drive.html).
Again, this data can be sent to a
remote server for storage, and can be played 
back later (in the style of a movie) in real time. 
• Change in contrast: The entire page of, say, 
a one-page advertisement is washed of color 
and low in contrast. Moving the pointer over 
the place where the user is looking makes that 
part brighten in color and contrast (an example 
is at http://mousEye.SyKronix.net/demos/phren.html).
Again, this forces the user to
reveal where he or she is looking, and we can 
play this back remotely. 
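
The three variations above can be illustrated with two small helper functions: one that estimates how long the pointer stayed over each region (the hesitancy described in the first variation), and one that decides how each region should be displayed. This is a sketch under assumptions: the log shape ({ region, t }), the function names, and the use of CSS filter values (blur, grayscale, contrast) with arbitrary amounts are choices made for this example. The source does not state how the author's demonstrations change the image.

```javascript
// Hypothetical mouse-over log entry: { region, t }, with t in milliseconds.
// The dwell time of a visit is the time until the next mouse-over event,
// or until `endTime` (for example, the final click) for the last visit.
function dwellTimes(hoverLog, endTime) {
  return hoverLog.map((entry, i) => {
    const next = i + 1 < hoverLog.length ? hoverLog[i + 1].t : endTime;
    return { region: entry.region, dwellMs: next - entry.t };
  });
}

// CSS filter value per region: the hovered region is shown clearly and
// all other regions are degraded. The amounts are arbitrary.
function regionFilters(regions, hovered, mode) {
  const degraded = mode === "focus" ? "blur(3px)" : "grayscale(1) contrast(0.5)";
  const styles = {};
  for (const r of regions) styles[r] = r === hovered ? "none" : degraded;
  return styles;
}

// Illustrative data only: the user hovers a left-column button, then
// moves to an index item and clicks it at t = 2500.
const hoverLog = [
  { region: "left-button", t: 0 },
  { region: "index-item", t: 1800 },
];
const dwell = dwellTimes(hoverLog, 2500);
const focusStyles = regionFilters(["left-button", "index-item"], "index-item", "focus");
const contrastStyles = regionFilters(["left-button", "index-item"], "index-item", "contrast");
console.log(dwell, focusStyles, contrastStyles);
```

In a browser, each returned value could be assigned to the style.filter property of the corresponding element inside a mouse-over handler.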

FUTURE TRENDS 

Eye tracking (see Porta, 2002, for a review of 
methods) can be used in such studies as reading 
(e.g., Stewart, Pickering, & Sturt, 2004) and Web 
usability (e.g., Karn, Ellis, & Juliano, 2000; Schiessl, 
Duda, Tholke, & Fischer, 2003). Eye-tracking
equipment, however, does not appear to have been
widely adopted for practical applications. This is
possibly in part due to the cost and complexity of the 
equipment, the intrusiveness of the equipment in 
market or usability testing, and the lack of acceptance
of such methods in real-world marketing industries
that are more comfortable with focus-group and
similar methods.

The use of mouse tracking as a substitute for 
eye-tracking equipment could be an answer to some 
of these issues. The author developed these methods 
in 1997 and adapted them for a marketing research 
agency in 2000, but salespeople believed that clients 
were too comfortable with focus group methods. 
The mouse tracking method obtained reliable results 
in testing, but development was abandoned for lack 
of sellability to clients (nonproprietary examples 
were posted at http://mousEye.SyKronix.net). A 
few others have since reported experiments with 
mouse-tracking methods, suggesting that there is 
hope to see greater use in the future. Mueller and 
Lockerd (2001) describe an application that appears to use
the x- and y-coordinates method. Tarasewich and 
Fillion (2004) and Ullrich, Wallach, and Melis (2003) 
describe methods that change the focus of the area 
under the mouse pointer. 



CONCLUSION 

Mouse tracking, which has reasonable grounding in 
attention theory, is advocated as a means to track 
user attention remotely as well as in a lab. Unlike eye 
tracking equipment, it is low in cost, relatively unob- 
trusive, and can be done remotely in a Web user’s 
own natural environment as well as in a laboratory 
setting. Using the mouse-over method that allows us
to change the focus and contrast of objects in the 
periphery of vision, it is possible to encourage the 
user to move a mouse pointer to indicate the location 
of his or her attention. The notion of automatism 
suggests that with a little practice, this would not 
interfere with the user’s task of browsing through a 
Web page or site. In this way, we could study 
voluntary attention, selective attention, and involun- 
tary attention in studies of Web usability and Web 
advertising. 
KEY TERMS

Attention: Mental processing that consumes 
our conscious thinking. Through a variety of mecha- 
nisms, there is a limit to the amount of processing 
that can take place in our consciousness. 

Automatism: An attention mechanism estab- 
lished through practice whereby the performance of 
a task apparently no longer interferes with the 
concurrent performance of other tasks. 

Habituation: The suppression of a response to 
or attention to a stimulus after repeated exposures. 

Involuntary Attention: The idea that some- 
thing can take a person’s attention by being novel, 
contrasting, startling, and so forth. 

Multiple-Resource Theory: A model of attention
that assumes many specialized preprocessors (e.g.,
visual system, auditory system) that can function
in parallel.

Selective Attention: The idea that a person 
can actively choose to attend to one of multiple 
stimuli that are present while ignoring others. 

Single-Channel Hypothesis: A model of at- 
tention that assumes that some mechanisms can 
process only one task at a time in a serial fashion. 
Some mechanisms have structural limitations (e.g., 
eyes can only point to one place). Multiple tasks are 
processed within some time frame by attention 
switching between tasks. 

Undifferentiated-Capacity Hypothesis: A model of
attention that assumes a flexible, multipurpose
central processor that can process multiple
tasks concurrently. This processor has a limited 
amount of resources, however, that can be allocated 
across all tasks. 

Voluntary Attention: The idea that a person 
can actively seek out information or things to think 
about. 
INTRODUCTION

Human interactions with computers are often via 
menus, and “in order to make information retrieval 
more efficient, it is necessary that indexes, menus 
and links be carefully designed” (Zaphiris,
Shneiderman, & Norman, 2002, p. 201). There are a 
number of alternatives to menus, such as icons, 
question-and-answer formats, and dynamic lists, but 
most graphical user interfaces are almost entirely 
menu-driven (Hall & Bescos, 1995). Menu systems 
have many advantages. For example, Norman (1991) 
identified low memory load, ease of learning and use, 
and reduced error rates as advantages of menu- 
driven interfaces. They frequently form the main 
part of a WYSIWYG (What You See Is What You 
Get) interface, providing most of the functionality in 
the more common operating systems such as 
Microsoft Windows. Consequently, familiarity also 
can be added to the list of advantages of using menus 
when accessing computer systems. These aspects 
are particularly important when considering public- 
access technologies, where individuals from across 
the population exhibiting a range of ages, skills, and 
experience levels will attempt to use the systems. 
Further, training or the opportunities for training will 
be minimal and, most likely, non-existent. 

BACKGROUND 

Two main types of menu designs are commonly 
found: traditional and pull-down. Traditional menus 
occupy the whole screen. Secondary and further 
menu levels also appear and, again, take up the 
whole screen. Once the final option choice has been 
taken, the screen is cleared for work. This type of 
menu is common in public access technologies such
as cash points and multimedia information kiosks.
Traditional menus are thought to be easier for first- 
time/novice users, because they are explicit in terms 
of operation. This is in contrast to pull-down menus, 
where operation is usually via a mouse or the enter 
and cursor keys. Pull-down menus have an initial 
main menu in the form of a bar at the top of the 
screen from which further lists of options may be 
seen and selected, thus leaving the majority of the 
remaining screen area for other purposes. This 
comprises their primary advantage: the user is able 
to stay in the same workspace/screen. However, 
this form of cascading menu hides information until 
the user activates the menu item, which can be 
viewed as a disadvantage (Walker, 2000). Pull- 
down menus form the main method for option selec- 
tion in the most commonly available packages from 
Microsoft and Macintosh. There are a number of 
variations of pull-down menus, such as horizontal
and vertical menus (Dong & Salvendy, 1999) and
split and folded menus (Straub, 2004). Split
menus present frequently accessed items at the top 
of the menu, while folded menus give the high 
frequency items first and on their own. After a short 
delay, the complete menu appears. 
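
The difference between split and folded menus can be sketched as follows. The item shape ({ label, frequency }), the function names, and the example items are assumptions for illustration. The sketch only computes what each menu would contain and does not render anything or implement the delay.

```javascript
// Hypothetical item shape: { label, frequency }, where frequency is a usage count.
// Returns the n most frequently used items, most frequent first.
function topItems(items, n) {
  return [...items].sort((a, b) => b.frequency - a.frequency).slice(0, n);
}

// Split menu: frequent items form a top section, and the remaining
// items follow below in their original order.
function splitMenu(items, n) {
  const top = topItems(items, n);
  const rest = items.filter((item) => !top.includes(item));
  return { top: top.map((i) => i.label), rest: rest.map((i) => i.label) };
}

// Folded menu: the high-frequency items are shown first on their own,
// and the complete menu is shown after a short delay.
function foldedMenu(items, n) {
  return {
    initial: topItems(items, n).map((i) => i.label),
    complete: items.map((i) => i.label),
  };
}

// Illustrative data only.
const menuItems = [
  { label: "Open", frequency: 50 },
  { label: "Save", frequency: 80 },
  { label: "Export", frequency: 5 },
  { label: "Print", frequency: 20 },
  { label: "Close", frequency: 30 },
];
const split = splitMenu(menuItems, 2);
const folded = foldedMenu(menuItems, 2);
console.log(split, folded);
```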

The comparison of traditional and pull-down menu 
types has been a somewhat neglected area, with 
much work focusing on the comparison of menus 
with other styles of interface, such as command 
languages (Mahach, Boehm-Davis & Holt, 1995). 
As a further example, Benbasat and Todd (1993) 
compared icons with text and direct manipulation 
with menus. Direct manipulation was defined in this 
context as the “physical manipulation of a system of 
interrelated objects which are analogous to objects 
found in the real world” (Benbasat & Todd, 1993, p. 
375). These objects are usually represented as 
icons. Benbasat and Todd (1993) found no difference
between the use of icons and text, and a speed
advantage of direct manipulation over menus. This 
advantage, however, was diminished when the task 
was repeated for a third time, indicating that there 
may be a learning effect occurring in menu interac- 
tions. However, studies such as this do not serve to 
indicate the basic type of menu layout that is most 
beneficial to the user. 

Given the importance of navigation in computer- 
based tasks, many studies have been carried out on 
menu design. For example, Yu and Roh (2002) 
investigated the effects of searching using a simple 
menu with a hierarchical structure, a global and local
navigation menu, and a pull-down menu. They found
search speeds differed significantly, with the pull- 
down menu being faster than the other two. 

Carey, Mizzi, and Lindstrom (1996) compared 
traditional and pull-down menu formats and found 
that experienced users completed menu search tasks 
faster than novice users, regardless of the menu 
style used, although there was no significant differ- 
ence between the two user groups in the number and 
type of errors made. The traditional menus elicited 
fewer errors than the pull-down menus for both 
experienced and novice users, but there was no time 
difference for task completion between the two 
menu types. Carey et al. (1996) also found that users 
preferred the traditional style menu, with this preference
being stronger for novices than for experienced
users. They suggested that using a cash point
application may have skewed the results
in favor of the traditional menus due to a familiarity 
effect. A further bias in favor of the traditional menu 
condition lies in the fact that it required fewer key 
presses per transaction than the pull-down menus. 
This is an intrinsic feature of the two menu de- 
signs — the pull-down system by definition will re- 
quire an additional action at the start to open the 
menu from the top of the screen, while the traditional 
menu would already be occupying the majority of the 
screen. 

Bernard and Hamblin (2003) compared cascad- 
ing menus in horizontal and vertical forms with a 
categorical indexed menu design. Although the ter- 
minology is different, the categorical indexed menu 
appears to be equivalent to the traditional menu, and 
the cascading menus seem to be pull-down menus. 
They found that searching was faster using the
indexed menu than the cascading, pull-down menus.
Their results indicated that using a categorical menu 
would be 4.27 minutes quicker when accessing 40 
pages on the Internet. (This figure was derived from 
Nielsen [2003], who suggested that a user accesses 
40 pages of information in a typical surf of the 
Internet.) Bernard and Hamblin (2003) also found 
that participants preferred the indexed menu, choosing
it as their first choice more often than either of the
two cascading menu designs.
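
To put the reported time saving in per-page terms, the arithmetic can be sketched as below. The variable names are illustrative, and the per-page value is simply the quoted total of 4.27 minutes divided by the 40 pages assumed above. It comes to roughly 6.4 seconds per page.

```javascript
// Figures quoted in the text.
const totalSavingMinutes = 4.27;
const pagesPerSession = 40;

// Average saving per page, in seconds.
const savingPerPageSeconds = (totalSavingMinutes * 60) / pagesPerSession;
console.log(savingPerPageSeconds.toFixed(1) + " seconds per page");
```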

In a study we conducted comparing traditional 
and pull-down menus with older and younger adults, 
time differences between the menu types were 
found for both age groups, with traditional menus 
eliciting shorter times than pull-down menus. Carey 
et al. (1996) found that traditional menus elicited 
fewer errors than pull-down menus and found no 
evidence for their hypothesis that experienced users 
would commit fewer errors than novices. The differ- 
ence in error rates between the menu types was 
replicated in our work, although the effect was only 
present for the older age group. 

In terms of participant opinions about the two 
menu types, younger respondents expressed a pref- 
erence for pull-down menus; older adults preferred 
traditional menus. Both menu styles were shown to 
be easy to use by both age groups. There was one 
significant difference — young participants found the 
traditional menus hard to search by trial and error 
compared to their ratings for pull-down menus and to 
the older adults’ ratings of both menu types. This 
may have been because younger participants are 
more familiar with pull-down menus. However, this 
finding is not supported by Bernard and Hamblin 
(2003). Their participants were relatively young 
with a mean age of 32.6 years (SD = 8.2) but 
indicated a preference in use for the indexed menu. 

FUTURE TRENDS 

These experimental studies have demonstrated a 
number of points. First, older adults were more 
disadvantaged in their use of pull-down menus com- 
pared to traditional menus, relative to younger adults. 
This was true of the time taken to complete the task, 
the number of errors, and the steps required. Sec- 
ondly, the type of searching used by participants in 
searching the two different menu structures was the
same — most searching was carried out using seman- 
tic knowledge. This was possible because of the 
strong semantic consistency within the menus in this 
experiment. Finally, older users expressed a prefer- 
ence for traditional menus, and younger users pre- 
ferred pull-down menus. This may be due to a 
familiarity effect of pull-down menus amongst 
younger, experienced computer users. As stated in 
the introduction, pull-down menus are used more 
commonly in PC and Macintosh software, meaning 
that people who use computers regularly will be more 
accustomed to them than traditional menus. These 
findings, therefore, have implications for the future 
design of systems, as more people become familiar 
with pull-down menus, and thus, the age effect 
associated with traditional menus will start to dimin- 
ish. 

Looking to the future, an adaptive menu system 
that responds to the needs of a particular user may 
prove useful. Lee and Yoon (2004) pointed out that 
as menus become longer, it is more difficult for users 
to locate specific items. However, some items will be
selected more frequently than others, and systems 
could be designed to recognize this. Public access 
technologies could utilize this feature and adjust the 
order of presentation of menu items so that more 
frequently accessed items were presented near the 
top of menus. 
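
One simple way to picture such an adaptive menu is a counter that records each selection and reorders the items accordingly. The sketch below is only an illustration of that idea: the function name createAdaptiveMenu, the cash point labels, and the ordering rule are assumptions, not a method described by Lee and Yoon (2004).

```javascript
// Counts selections and orders the menu so that frequently chosen
// items are presented nearer the top.
function createAdaptiveMenu(labels) {
  const counts = new Map(labels.map((label) => [label, 0]));
  return {
    // Record one selection of a menu item.
    select(label) {
      if (!counts.has(label)) throw new Error("Unknown menu item: " + label);
      counts.set(label, counts.get(label) + 1);
    },
    // Current presentation order. Items with equal counts keep their
    // original relative order, because Array.prototype.sort is stable.
    ordered() {
      return [...labels].sort((a, b) => counts.get(b) - counts.get(a));
    },
  };
}

// Illustrative data only.
const adaptiveMenu = createAdaptiveMenu(["Balance", "Withdraw cash", "Transfer", "Change PIN"]);
adaptiveMenu.select("Withdraw cash");
adaptiveMenu.select("Withdraw cash");
adaptiveMenu.select("Withdraw cash");
adaptiveMenu.select("Transfer");
const adaptedOrder = adaptiveMenu.ordered();
console.log(adaptedOrder);
```

A practical design would also need to weigh the benefit of faster access against the cost of items changing position, which could itself disrupt users who have learned the original layout.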

CONCLUSION 

When deciding between traditional and pull-down 
menu styles in an application, there are other factors 
to take into account. For some applications, only one 
menu style may be suitable for practical reasons. 
Although there does seem to be an advantage to 
traditional menus in terms of speed of use and 
reduced error rates, in particular for older and less 
experienced computer users, this type of menu takes 
up much more screen space than pull-down styles. In 
many interfaces, this will be impractical, particularly 
when a lot of information must be available on the 
screen. However, if everything else is equivalent, it 
is suggested that traditional menus be used in inter- 
face design, especially when the user group com- 
prises older adults and/or novice computer users. 
KEY TERMS

Graphical User Interface: Commonly abbrevi- 
ated GUI, this type of interface is a standardized 
way of presenting information and opportunities for 
interaction on the computer screen using the graphi- 
cal capabilities of modern computers. GUIs use 
windows, icons, standard menu formats, and so 
forth, and most are used with a mouse as well as a 
keyboard. 

Icons: Icons are a principal feature of GUIs and 
are small pictures representing objects, files, pro- 
grams, users, and so forth. Clicking on the icon will 
open whatever is being represented. 

Menu: A list of commands or options that ap- 
pear on the computer screen and from which the 
user makes a selection. Most software applications
now have a menu-driven component in contrast to a
command-driven system; that is, where explicit com- 
mands must be entered as opposed to selecting items 
from a menu. 

Mouse: In computer terms, this is a hand-oper- 
ated electronic device that moves the cursor on a 
computer screen. A mouse is essentially an upside- 
down trackball, although the former needs more 
room during operation as it moves around a horizon- 
tal surface. 

Public Access Technologies: These are com- 
puter-based technologies that are designed to be 
used by the general public. This has implications for 
their design, since they will be used by a diverse and 
unknown user group drawn from the human popula- 
tion. 

Pull-Down Menu: When the user points at a 
word with either a keystroke or a mouse, a full menu 
appears (i.e., is pulled down, usually from the top of 
the display screen), and the user then can select the 
required option. A cascading menu (i.e., a submenu) 
may open when you select a choice from the pull- 
down menu. 

Traditional Menu: This type of menu is essen- 
tially a series of display screens that appear sequen- 
tially as the user responds to the requests detailed on 
each screen. 

WIMP: This acronym stands for Windows, Icons,
Menus, Pointing device, and is a form of GUI.

WYSIWYG: This acronym stands for What You 
See Is What You Get; it is pronounced Wiz-zee-wig. 
A WYSIWYG application is one where you see on 
the screen exactly what will appear on the printed 
document (i.e., text, graphics, and colors will show 
a one-to-one correspondence). It is particularly popu- 
lar for desktop publishing. 






Turning the Usability Fraternity into a 
Thriving Industry 

Pradeep Henry 

Cognizant Technology Solutions, India 



INTRODUCTION 

Many business and IT executives today think that 
usability is an important aspect of software appli- 
cations that are used in enterprises (Orenstein, 
1999). However, the term usability represents dif- 
ferent things to different people. And, to most people, 
usability does not sound like an aspect that could 
really impact enterprise performance and bottom- 
line. 

Literature suggests that the usability fraternity 
has failed to make an impact so far. For example, 
Bias and Mayhew (1994) ask "... given that the 
Human Factors Society (now the Human Factors 
and Ergonomics Society) is a quarter of a century 
old, why is it taking so long for usability engineering 
to achieve its place alongside the other accepted 
disciplines?” 

Later, this article looks at some reasons why, and 
what to do about it. 



BACKGROUND 

There are thousands of advertising agencies in the 
world and many of them have a large staff and huge 
revenues. Advertising is a recognized industry. Is 
usability a recognized industry? How many usabil- 
ity firms are there? How many usability firms have 
over 100 people or 10 million dollar revenues? How 
many are listed in the stock market? 

One U.S.-based organization says that although
its usability engineering group of 18 specialists
is small, this number is still larger than what
many independent usability groups have. That gives 
us an idea of the average size of usability firms. 

What are some of the problems that are stopping 
this field from growing big? Here are some: Practitioners
are not picking up the right skills. Practitioners
are not doing the right “usability” things. Practitioners
are not impacting the business world. And
practitioners are not promoting the right things. Of
course, there are exceptions, but they are few. The 
following sections look at each problem in detail. 

DEVELOP GOOD CREDENTIALS 

Many usability practitioners are believed to not have 
the right kind of training. Shneiderman, Tremaine, 
Card, Norman, and Waldrop (2002) say that CHI 
(computer-human interaction) fails because its prac- 
titioners are badly trained. And Mauro (n.d.) says: 
“This important new science (usability engineering)
has in many instances been dramatically misrepresented
by pseudo-practitioners, who claim
to have such expertise but often do not. As a result, 
many corporations and government agencies that 
retained such experts often found the experience 
unsatisfying and the promises of creating signifi- 
cantly more usable products and services illusive.” 

What is the education or skill-set that usability 
practitioners bring to their profession? Well, some 
bring expertise limited to the human side of users. 
Some others bring visual design or graphic tools 
expertise. Sure, those skills are required, but they 
are not enough. Practitioners need to be well-trained 
in technology and business. These are often the 
missing skills. 

Being technology-literate is important for practi- 
tioners. Technology-literate would mean having a 
degree in computer science or software engineer- 
ing. Technology-literate practitioners will know if 
their design can be implemented using the chosen 
application development software. They will know 
the technical impact of the design solutions they 
come up with (say, on system performance). When 
they speak the language of developers, they will also 
be trusted by those professionals, who will imple- 
ment the design solutions. 
Being business-literate is important for practitioners.
Business-literate would mean knowing how
enterprises in various industries (banking, insurance, 
retail, etc.) perform their business functions. Also, 
since business-literate practitioners understand the 
business reasons why an enterprise invests in an 
application, they will know the impact of user- 
performance (or usability) on the enterprise rather 
than on the users alone. 



FOCUS ON DESIGN, NOT TESTING 

Literature — from the earliest to the most recent — 
has always recommended conducting usability tests, 
unfortunately with little or no emphasis on design. 
Authors have included well-known usability gurus. 
No wonder, most usability practitioners appear to be 
focused on testing. Testing, of course, is a useful 
technique to discover certain types of design prob- 
lems. However, the point is that a test-fix-test-fix 
kind of approach is not going to result in a highly 
usable user interface. This argument could be best 
appreciated by imagining the approach of usability- 
testing a building that was not designed correctly in 
the first place. Shneiderman et al. (2002) call this 
orientation to evaluation “The first human factors 
limitation.” 

Shneiderman et al. (2002) say, “... we do not 
contribute anything of substance: we are critics, able 
to say what is wrong, unable to move the product line 
forward.” They go on to say that usability practitioners
must become designers. Yes, practitioners
should apply strong design skills using a strong 
design-driven process (Henry, 2003) that preferably 
has testing as one of the evaluation methods. 

MAKE HIGH-IMPACT CONTRIBUTIONS

Here are a few reasons why low-impact user 
interfaces are rampant. 

• Most usability practitioners fight only for
screen-level improvements to user interfaces. 
Such improvements do not make a significant 
impact on the performance or bottom-line of 
the enterprise using the application. On the
other hand, an improvement in the structure of
the application’s navigation is likely to make a 
significant improvement in the user’s perfor- 
mance thereby impacting things like enter- 
prise workforce productivity (Henry, 2003). 

• One candidate the author evaluated for recruit- 
ment into his usability group had an MS degree 
in Human Factors and three years of HCI 
experience. As always, the candidate was given 
a test to evaluate his skills. The candidate’s 
design recommendation sheet was filled with 
terms such as memory load, mental load, con- 
ceptual load, syntactic learning load, and cog- 
nitive load. Such a narrow focus on the human 
side of users likewise does not help make a significant
impact on the enterprise.

• Another piece of advice (and therefore practice)
leads to low-impact contributions.
Usability practitioners have been inspired into 
believing that even a small usability improve- 
ment is better than no improvement at all. That 
might sound like good advice. But, following 
this advice only results in mediocre practitio- 
ners, mediocre applications, and therefore a 
poor image for the whole usability fraternity. 

If usability practitioners only deliver low-impact 
contributions, how will enterprises take them seri- 
ously? Practitioners need to rethink the current 
thinking and practices in usability. 

PROMOTE HIGH-IMPACT CONTRIBUTIONS

Most of the time, the usability fraternity just talks 
about the small improvements that it creates. These 
“small improvements” are things that do not signifi- 
cantly impact the enterprise. These are things that 
are not perceived as significant by the enterprise. 

Instead, practitioners should start talking about 
big things (of course, assuming they have achieved 
big things). For example, if they redesigned an 
application user interface to significantly reduce 
expenses on user-training, they should talk about it 
and preferably in dollar terms. 

Usability practitioners know that users get con- 
fused, frustrated, dissatisfied, and so forth while 
interacting with poorly designed user interfaces. 
However, those are not the terms that typically get
the attention of business folks. So, practitioners should 
understand and articulate the business impact of such 
user reactions. 

Practitioners, however, should not use a preten- 
tious word to describe what they do. One company 
the author worked with had a user interface “Archi- 
tect,” who pulled out her screen visual layout tem- 
plate — when in fact the author was talking about 
user interaction architecture! 

Also, if a practicing team has the capability to 
design high-impact user interfaces, it should not try 
to cost-justify its efforts. Instead, it should talk about 
Return on Investment (ROI). The phrase cost- 
justification might not sound different from ROI, but 
it is. Sure, methods like cost-benefit analysis, cost- 
justification, and ROI projection all attempt to pre- 
dict the financial and other business consequences of 
an action. However, when terms like cost-justifica- 
tion are mentioned, attention is directed to cost. On 
the other hand, when a term like ROI is mentioned, 
attention is directed to returns or gains. Merholz and 
Hirsch (2003) say, “The key strategy is to get enter- 
prises to recognize that user experience is not simply 
a cost of doing business, but an investment — that 
with appropriate expenditure, you can expect a finan- 
cial return.” 
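
The difference in emphasis can be made concrete with a small calculation. The sketch below is illustrative only: the dollar figures and the function name roi are hypothetical and do not come from any project described here. It applies the common definition ROI = (gain - cost) / cost to the user-training example mentioned earlier.

```python
def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical figures for a redesign that shortens user training.
redesign_cost = 50_000.0           # usability work, in dollars
training_cost_before = 200_000.0   # yearly training spend, old interface
training_cost_after = 80_000.0     # yearly training spend, redesigned interface

yearly_savings = training_cost_before - training_cost_after
first_year_roi = roi(yearly_savings, redesign_cost)

print(f"Yearly savings: ${yearly_savings:,.0f}")
print(f"First-year ROI: {first_year_roi:.0%}")
```

Reporting the result as a return on the money spent, rather than as a justification of the cost, keeps attention on the gain.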



FUTURE TRENDS 

Usability practitioners will benefit from regularly
conducted research reports on how well and how
frequently their work affects the enterprise bottom
line. Such reports would help practitioners
know how well the field is doing from the enterprise
perspective so that they can focus on the areas that
matter most to the enterprise.

CONCLUSION 

It is sad that usability, which has the potential to bring
great business value, has so far not made an impact
on (and therefore is not taken seriously by) enterprises
that invest in IT. It has therefore remained a
small and struggling field for many decades. The
blame lies with usability practitioners. The good
news, however, is that with the right skills and the
right approach, usability practitioners will not only
keep their jobs in difficult economic times, but
actually grow the field into a thriving industry.

REFERENCES 

Bias, R. G., & Mayhew, D. J. (1994). Summary: A
place at the table. In R. G. Bias, & D. J. Mayhew
(Eds.), Cost-justifying usability (p. 324). San Diego,
CA: Academic Press.

Henry, P. (2003, March & April). Advancing UCD 
while facing challenges working from 
offshore. Interactions, 38-47. 

Mauro, C. L. (n.d.). More science than art. Retrieved
September 4, 2004, from
http://www.taskz.com/usability_indepth.php

Merholz, P., & Hirsch, S. (2003). Report review:
Nielsen/Norman group’s usability return on investment.
Retrieved September 4, 2004, from
http://www.boxesandarrows.com/archives/report_review_nielsennorman_groups_usability_return_on_investment.php

Orenstein, D. (1999, August). Is software too hard 
to use? CNN.com. Retrieved October 4, 2004, from 
http://www.cnn.com/TECH/computing/9908/25/ 
easyuse.ent.idg/index.html 

Shneiderman, B., Tremaine, M., Card, S., Norman, 
D. A., & Waldrop, M. M. (2002, April 20-25). 
CHI @20: Fighting our way from marginality to 
power. Extended Abstracts: Conference on Hu- 
man Factors in Computing Systems, Minneapolis, 
Minnesota (pp. 688-691). New York: ACM Press. 

KEY TERMS 

Application: Computer software meant for a 
specific use such as payroll processing. 

Business and IT Executives: Senior people 
at a business organization, such as the chief infor- 
mation officer (CIO), chief technology officer 
(CTO), and chief executive officer (CEO).

Business Process: A series of related activities
performed by staff in a business organization to
achieve a specific output (for example: loan
processing).

Enterprise: A business organization. 

Enterprise Performance and Bottom-Line:
Data critical for business success, such as employee
productivity, customer satisfaction, and revenues.



Usability Practitioner: A person who designs 
and evaluates software user interfaces. 

Usability Fraternity: A group of people that 
designs and evaluates software user interfaces. 






Ubiquitous Computing and the Concept of 
Context 

Antti Oulasvirta 

Helsinki Institute for Information Technology, Finland 

Antti Salovaara 

Helsinki Institute for Information Technology, Finland 



INTRODUCTION 

Mark Weiser (1991) envisioned in the beginning of 
the 1990s that ubiquitous computing, intelligent small- 
scale technology embedded in the physical environ- 
ment, would provide useful services in the everyday 
context of people without disturbing the natural flow 
of their activities. 

From the technological point of view, this vision 
is based on recent advances in hardware and soft- 
ware technologies. Processors, memories, wireless 
networking, sensors, actuators, power, packing and 
integration, optoelectronics, and biomaterials have 
seen rapid increases in efficiency with simultaneous 
decreases in size. Moore’s law on capacity of 
microchips doubling every 18 months and growing 
an order of magnitude every five years has been 
more or less accurate for the last three decades. 
Similarly, fixed network transfer capacity grows an 
order of magnitude every three years, wireless 
network transfer capacity every 5 to 10 years, and 
mass storage every 3 years. Significant progress in 
power consumption is less likely, however. Innova- 
tions and breakthroughs in distributed operating en- 
vironments, ad hoc networking, middleware, and 
platform technologies recently have begun to add to 
the ubiquitous computing vision on the software side. 
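
The growth rates quoted above can be cross-checked by converting between a doubling period and the time needed for one order of magnitude. The following is a minimal sketch; the function names and the 15-year horizon are chosen here for illustration and are not part of the original figures.

```python
import math


def months_per_tenfold(doubling_months: float) -> float:
    """Months needed to grow tenfold, given a fixed doubling period."""
    return doubling_months * math.log2(10)


def growth_factor(years: float, years_per_tenfold: float) -> float:
    """Total growth after `years` at one order of magnitude per `years_per_tenfold`."""
    return 10 ** (years / years_per_tenfold)


moore_tenfold_years = months_per_tenfold(18) / 12
print(f"Doubling every 18 months = tenfold every {moore_tenfold_years:.1f} years")

# Growth over 15 years for the rates quoted in the text
# (wireless uses the optimistic end of the 5-to-10-year range).
for name, years_per_tenfold in [
    ("microchips", 5),
    ("fixed network", 3),
    ("wireless network", 5),
    ("mass storage", 3),
]:
    print(f"{name}: x{growth_factor(15, years_per_tenfold):,.0f} in 15 years")
```

Doubling every 18 months works out to roughly one order of magnitude every five years, so the two statements of Moore’s law in the text agree with each other.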

Altogether, these technological advances have a 
potential to make technology fade into the back- 
ground, into the woodwork and fabric of everyday 
life, and incorporate what Weiser (1991) called 
natural user interfaces. Awareness of situational 
factors (henceforth, the context) consequently was 
deemed necessary for this enterprise. This article 
looks at the history of the concept of context in 
ubiquitous computing and relates the conceptual 
advances to advances in envisioning human-com- 
puter interaction with ubiquitous computing. 



BACKGROUND 

Ubiquitous Computing Transforms 
Human-Computer Interaction 

Human-computer interaction currently is shifting its 
focus from desktop-based interaction to interaction 
with ubiquitous computing beyond the desktop. Con- 
text-aware services and user interface adaptation 
are the two main application classes for context 
awareness. Many recent prototypes have demon- 
strated how context-aware devices could be used in 
homes, lecture halls, gardens, schools, city streets, 
cars, buses, trams, shops, malls, and so forth. 

With the emergence of so many different ways 
of making use of situational data, the question of 
what context is and how it should be acted upon has 
received a lot of attention from researchers in HCI 
and computer science. The answer to this question, 
as will be argued later, has wide ramifications for the 
design of interaction and innovation of use purposes 
for ubiquitous computing. 

HISTORY 

Context as Location 

In Weiser’s (1991) proposal, ubiquitous computing 
was realized through small computers distributed 
throughout the office. Tabs, pads, and boards helped 
office workers to access virtual information associated
with physical places as well as to collaborate over
disconnected locations and to share information
using interfaces that take locational constraints
sensitively into account. Although Weiser (1991) never
intended to confine context to mean merely location,
the following five years of research mostly focused 
on location-based adaptation. Want et al. (1992) 
described the ActiveBadge, a wearable badge for 
office workers that could be used to find and notify 
people in an office. Weiser (1993) continued by 
exploring systems for sharing drawings between 
disconnected places (the Tivoli system). Schilit et al. 
(1994) defined context to encompass more than 
location — to include people and resources as well — 
but their application examples were still mostly 
related to location sensing (i.e., proximate selection, 
location-triggered reconfiguration, location-triggered 
information, and location-triggered actions). Want
et al. (1995) added physical parameters like time and
temperature to the definition. Perhaps the best- 
known mobile application developed during this lo- 
cation paradigm era was the CyberGuide (Long et 
al., 1996), an intelligent mobile guide that could be 
used to search for nearby services in a city. This 
paradigm was also influential in the research on 
Smart Spaces, such as intelligent meeting rooms. 



The Representational Approach to
Context

Although the idea that location equals context was
eventually dismissed, many researchers coming from
computer science still believed that contexts were
something that should be recognized, labeled, and
acted upon (Schmidt et al., 1998). Here, context was
supposed to be recognized from sensor data, labeled,
and given to applications that would use it as a basis
for adaptation. Dey et al.’s (1999) five Ws of
context (Who, Where, When, What, and Why)
extended this approach and demonstrated convincing
examples of how a labeled context could be used
for presenting, executing, and tagging information.
Tennenhouse’s (2000) proactive computing paradigm
endorsed a similar way of thinking about context,
emphasizing the role of computers in making
real-time decisions on behalf of (or pro) the user. A
somewhat similar approach that also attempts to
delegate decision-making responsibility to intelligent
systems is taken by the Ambient Intelligence (AmI)
technology program of the European Union (ISTAG).
One part of the AmI vision entails intelligent agents
that assume some of the control responsibility from
the users.

The latest widely referred to definition was given
by Dey et al. (2001), who defined context as “any
information that characterizes a situation related to
the interaction between users, applications, and the
surrounding environment” (p. 106). Satyanarayanan’s
(2001) formulation of pervasive computing also
belongs to this line of thinking, but the author has
chosen to avoid defining context and merely admits
that it is rich and varies.

In his review of context definitions over the
years, Dourish (2004) calls this the representational
approach to context. Recent work within
this branch has come close to finding the limits to
recognizing and labeling contexts. For example,
simple physical activities of a person in a home
environment can be recognized with about 80-85%
accuracy (Intille et al., 2004), as can the
interruptibility of a person working in an office
(Fogarty et al., 2004). Some critics have drawn
parallels from this enterprise to problems encountered
in strong AI (Erickson, 2002).

FUTURE TRENDS



New Directions Inspired by Human and 
Social Sciences 

By the year 1996, other approaches to context were 
beginning to emerge. Wearable computing (Mann, 
1996) looked at personal wearable computers able to 
help us remember and capture our everyday experi- 
ences through video and sound recording of context. 
Tangible bits (Ishii & Ullmer, 1997), although in- 
spired by ubiquitous computing, looked at context not 
as something that had to be reacted upon but as 
surroundings of the user that could be augmented 
with tangible (i.e., graspable) computers and ambi- 
ent media that display digital information using dis- 
traction-free output channels. 

More recently, researchers have started to em- 
phasize the social context and issues in people’s 
practices and everyday conduct. These approaches 
give special consideration to activities that people 
engage in and highlight their dynamic nature, different
from the labeling-oriented representational
approach. Activity-centered approaches both emphasize
turn taking in communication between the user
and the applications (Fischer, 2001) and acknowledge
the situated and time-varying nature of the
needs that a user has in his or her life (Greenberg,
2001). This line of research highlights the difficulties
that exist in making correct inferences about a user’s
tasks through sensor information. Considerations of
social issues in ubiquitous computing design include
questions of how to fit computational intelligence into
people’s routines in an unremarkable manner (Tolmie
et al., 2002) and how people’s patterns of interaction
with humans and computers change when
computationally augmented artifacts are adopted into
use. Yet another emerging idea from HCI addresses
specifically the aim to be free from distraction; that
is, when it is appropriate to interrupt the user at his or
her present task. Some call systems with such inferring
capabilities attentive user interfaces (Vertegaal,
2003).

CONCLUSION 

HCI in ubiquitous computing has been both inspired 
and constrained by conceptual developments regard- 
ing the concept of context. Weiser’s (1991) initial 
work caused researchers to conceive context nar- 
rowly as encompassing mainly location and other 
static, easily measurable features of a user’s con- 
text. After about five years of research, the restric- 
tiveness of this definition was realized, and broader 
definitions were formulated. Still, context mainly was 
pursued by computer scientists and seen as some- 
thing that must be labeled and reacted upon to adapt 
user interfaces. More recent work by human and 
social scientists has emphasized the role of user 
studies and theoretical reasoning in understanding 
what context entails in a particular application.

KEY TERMS

Attentive User Interfaces: AUIs are based on
the idea that modeling the deployment of user attention
and task preferences is the key for minimizing
the disruptive effects of interruptions. By monitoring
the user’s physical proximity, body orientation, eye
fixations, and the like, AUIs can determine what
device, person, or task the user is attending to.
Knowing the focus of attention makes it possible in
some situations to avoid interrupting the users in
tasks that are more important or time-critical than
the interrupting one.

Context: That which surrounds and gives meaning
to something else. (Source: The Free On-line
Dictionary of Computing, http://foldoc.doc.ic.ac.uk/foldoc/)

Peripheral Computing: The interface attempts
to provide attentionally peripheral awareness of
people and events. Ambient channels provide a
steady flow of auditory cues (e.g., a sound like rain)
or gradually changing lighting conditions.

Pervasive Computing: Technology that provides
easy access to information and other people
anytime and anywhere through a mobile and scalable
information access infrastructure.

Proactive Computing: A research agenda of
developing interconnected devices and agents,
equipped with faster-than-human-speed computing
capabilities and means to affect real-world phenomena
that a user can monitor and steer without a need
to actively intervene in all decision-making situations.
By raising the user above the traditional
human-computer interaction loop, efficiency and
freedom from distraction are expected to be enhanced.

Tangible Bits: According to Hiroshi Ishii of
MIT, “the smooth transition of users’ focus of
attention between background and foreground using
ambient media and graspable objects is a key challenge
of Tangible Bits” (p. 235).

Tangible User Interfaces: Systems that give a
physical form to digital information through augmenting
tools and graspable objects with computing
capabilities, thus allowing for smooth transitions
between the background and foreground of the
user’s focus of attention.



Unremarkable Computing: An approach that 
focuses on designing domestic devices that are 
unremarkable to users. Here, unremarkable is un- 
derstood as the use of a device being a part of a 
routine, because, it is believed, routines are invisible 
in use for those who are involved in them. 

Wearable Computing: Technology that moves 
with a user and is able to track the user’s motions 
both in time and space, providing real-time informa- 
tion that can extend the user’s knowledge and 
perception of the environment.



INTRODUCTION

Let’s remember the first films that started to show
the broad public futuristic communication scenarios,
where users were able to exchange almost any kind
of information to communicate with anyone at any
place and at any time, like Marc Daniels’ “Star
Trek” in the 1960s and James Cameron’s “Terminator”
in the 1980s, for example. The consequence of
this was that impersonalized spaces (e.g., airports)
(Auge, 1992) could easily become a personalized
environment for working or leisure, according to the
specific needs of each user.

These kinds of scenarios recently have been 
defined as ubiquitous communication environments. 
These environments are characterized by a system
of interfaces that can be either fixed in allocated
positions or portable (and/or wearable) devices.
According to our experience with 2G technologies, we
can foresee that the incoming 3G communication
technologies will ensure that the second
type of interface will become more and
more prominent in our daily lives. The reason is that
portable and wearable devices represent a sort of 
prosthesis, and therefore, they reflect more than 
ever the definition of interface as an extension of the 
human body. When in 1973 Martin Cooper from 
Motorola patented an interface called Radio Tele- 
phone System (which can be defined as the first 
mobile phone), he probably didn’t suspect the sub- 
stantial repercussion of his invention in the human 
microenvironment and in its social sphere. The 
mobile phone, enabling an interpersonal communica- 
tion that is time- and place-independent, has changed 
humans’ habits and their way of making relation- 
ships (Rheingold, 1993). This system made possible 
a permanent and ubiquitous connection among users.
At the same time, it has made users free to
decide whether to be available or not at any moment
and in any place they might be (Hunter, 2002).

This article is based on empirical work in the field
with network operators (Vodafone) and handset
manufacturers (Nokia) and research at the
Politecnico di Milano University, the University of
Lapland, and the University of Brighton. The intention
is to give a practical approach to the design of
interfaces in ubiquitous communication scenarios.

BACKGROUND 

The methodologies and guidelines for HCI design
for handhelds initially were imported from the
general theories of HCI for the Web (Nielsen, 2000).
Only after 1999 did this issue start to gain relevance
as a research area. This is reflected in the
proliferation of focused conferences such as Mobile
HCI, which started as a workshop in 1999, and in the
explicit treatment of the topic at more general HCI
conferences (e.g., CHI, HCI, Interact, Ubicomp, etc.).
Unfortunately, the literature in this area is still scarce
(Beaulieu, 2002; Bergman, 2000; Burkhardt, 2002;
Hunter, 2002; Stanton, 2001; Weiss, 2002).

INTERNET MOBILE AND 
MOBILE COMMUNICATION 

The Internet is related to a virtual space in which it 
is possible to interact with information. Mobile 
Internet, however, has represented an evolution of 
the concept of utopical (no real space) interaction to 
the concept of topical interaction, in which interac- 
tion (still with a virtual information space) happens in 
real places (Benedikt, 1991). This simultaneous 
presence of utopical and topical interaction makes 
necessary a direct relationship between both ambits 
(e.g., thanks to the GPS, what happens in real space 
must have an effect on the virtual one and vice 
versa). The communication now becomes space- 
sensitive or, better, context-sensitive. 

Mobile communication is a broader concept than
mobile Internet, as it embraces not only the connection
to the net (intranet or extranet) but also voice
and messaging (SMS, EMS, MMS) (Cereijo, 2001).



USAGE OF MOBILE 
COMMUNICATION 

In the 1980s, the first generation (1G) of mobile
communication systems revolutionized the
telecommunications (TLC) world, as users could carry
a phone in their pocket. The 2G communication system
and its new protocols to access the Internet, beyond
just voice calls, provided mobile users with a whole
range of interactive services based on wireless data
transmission.

Today, the market is characterized by different 
technologies — in America and Japan, the IS95 net- 
work based on CDMA (Code Division Multiple 
Access); in Europe, Asia, and Africa, the GSM 
(Global System for Mobile Communication) using 
TDMA (Time Division Multiple Access). Analysts 
foresee a relevant growth of mobile Internet users in 
the upcoming years — by 2005, more mobile phones 
will be connected to the Internet than PCs (Ovum, 
3G Mobile, 2000). 

Now we see the advent of 3G and 4G systems
offering unprecedented bandwidth, with connection
speeds of up to 2 Mbps for data transmission and
audio and video streaming capabilities directly on the
phone. The variety of the services offered is a
challenge for today’s service and application
developers, and the battlefield is usability
and effectiveness (Cereijo, 2002).



MULTI-ACCESS AND MULTI-
CHANNEL CONVERGENCE

3G will be able to merge (at least) four media
(Internet, SMS, TV, Smart-home). It will therefore be
crucial to offer an integrated system of new
services with a perceived added value for the mobile
user. This integration also is called convergence
and implies that all the information exchanged in the
system (independently of the access device)
somehow must be centralized. The concept of
convergence is related to that of the interoperability of
the components of the same platform (e.g., the
agenda, e-mail, and notepad must share the same
information) and between the different multilingual
devices available. That means that a user’s transaction
with a certain interface (e.g., flight booking from an
iTV set-top box) also must appear in real time if the
user accesses the related site afterwards with a
different interface (e.g., PC or Pocket PC).
Convergence raises further problems; for example, the
information must be optimized according to the
physical and technical features of each interface
(Cereijo, 2003).

One of the main consequences of 3G will be an
enhanced interaction with information (companies
and institutions), people (personal and group
communication), the smart-house and the automated
office. This context of ubiquitous communication
(across mobile phones, iTV, palms, pocket PCs,
PDAs, etc.) will have applications in domotics,
videoconferencing, commerce, iTV, entertainment,
learning, finance, medicine, and so forth (Burkhardt,
2002).

MULTI-CHANNEL IDENTITY

One of the challenges of 3G will be the design of the
multi-channel identity. Each type of device has
different technical and physical features that condition
the design decisions (regarding architecture,
navigation, contents, and graphics).

This requires a coordinated graphic and interaction
design that takes these issues into account. At
the same time, the peculiarities of each interface of
the system (Figure 1) might make the achievement
of the desired design homogeneity difficult (from both
the functional and visual point of view) (Bergman,
2000).

Figure 1.

USABILITY AND SELF-USABILITY

The usability of a system can be defined as the
effectiveness, efficiency, and satisfaction with which
specified users achieve specified goals in particular
environments (ISO 9241, ergonomic requirements
for office work with visual display terminals, Part 11).
This means that not only the measurement of mobile
users’ performance is important, but also their
satisfaction, which is related to the perceived relevance
of the mobile services provided in the particular
contexts of use. Now more than ever, a participative
design process (synergetic cooperation between the
designer and the user during the whole cycle of the
creation of the product) can be appropriate to
reduce design and development costs as well as to
provide successful services (Jordan, 2000).
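
The three components of this definition can be estimated from user test sessions. The sketch below is one possible way to do so and rests on assumptions: the Session record, the 1-to-5 satisfaction scale, and the formulas (completion rate, completed goals per minute, mean rating) are chosen here for illustration and are not prescribed by ISO 9241-11.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One user's attempt at a task (hypothetical record layout)."""
    completed: bool      # did the user reach the specified goal?
    seconds: float       # time spent on the task
    satisfaction: int    # self-reported rating, 1 (low) to 5 (high)


def usability_summary(sessions: list[Session]) -> dict[str, float]:
    """Summarize effectiveness, efficiency, and satisfaction for one task."""
    if not sessions:
        raise ValueError("at least one session is required")
    completed = [s for s in sessions if s.completed]
    total_minutes = sum(s.seconds for s in sessions) / 60
    return {
        # Effectiveness: share of users who achieved the goal.
        "effectiveness": len(completed) / len(sessions),
        # Efficiency: goals achieved per minute of user time.
        "efficiency": len(completed) / total_minutes,
        # Satisfaction: mean self-reported rating.
        "satisfaction": mean(s.satisfaction for s in sessions),
    }


# Made-up sessions for a single task on a mobile service.
sessions = [
    Session(completed=True, seconds=40, satisfaction=4),
    Session(completed=True, seconds=50, satisfaction=5),
    Session(completed=False, seconds=90, satisfaction=2),
    Session(completed=True, seconds=60, satisfaction=4),
]
summary = usability_summary(sessions)
print(summary)
```

Keeping satisfaction as a separate figure, rather than folding it into a single score, reflects the point that performance alone does not capture the perceived relevance of a service.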

There is another aspect that concerns usability 
that deserves special regard. It can be defined as 
self-usability and is related to the emotional effects of
this extended way of communicating through
multi-access services (e.g., iTV users interacting with
distributed interfaces). In order to make technology
friendlier, mobile users have developed the so-called new
m-language of the mobile interactive community. It is 
made of slang words, neologisms, acronyms, num- 
bers, and icons of cyber communication. This new 
way of expressing was born as a consequence of 
interacting with SMS, e-mail, chat, forums, and 
newsgroups, and evolved with WAP and GPRS 
communication (Macleod, 1994). 

MICRODESIGN 

The interface is not a physical object but the space in
which the interaction among the human body, devices,
and aim of the action happens. Therefore, from
the perspective of human-centered microdesign, the
interest of the design director is not the single button,
display, or screen, but the global project of the
human-machine interface (Nielsen, 1992). Interacting
with a wired device is related to a specific
user experience; usually, the content is rich, and
audio-visual factors are relevant. The navigation
here has several possible schemes due to the large
size and rich color of the screen. Moreover, animated
images and videos can enrich and facilitate
interaction. Accessing the Internet through mobile
devices, in contrast, does not replicate PC-based
access. In fact, compared with the PC, this kind
of interface has important limitations: the small size
of the interactive area, the low resolution, the high
cost of the device and of the connection, the slowness
of data entry, the reduced memory storage, the
limited processing power, and the short battery life
(Weiss, 2000).

Figure 2(a).

Another aspect that contributes to the
aforementioned complexity in microdesign is related
to the shape of the device. In fact, two main
trends can be noted: in one case, navigation
tools have been gradually transferred to the graphic
interface and voice commands (e.g., PDAs, Palms,
and tablet PCs, where keyboards have been almost
eliminated in favor of the screen [Figure 2a]); and
in the other case, the device tends to configure itself
as a typical mobile phone (Figure 2b).

In any case, it is crucial to be able to develop
services that offer positive experiences to end
users independently of these technology limitations.
Information retrieval should no longer be a
frustrating experience; users should be satisfied in less
time, through a logical sequence and with minimum
effort (Picard, 2000).

Figure 2(b).

From GSM to 3G systems, wireless devices are
driving the mobile Internet evolution, opening up 
innovative ways of communicating and interacting 
with content. The knowledge of this interaction 
context is crucial in order to be able to design and 
develop multi-access services. The uncertainty fac- 
tor regarding any technology’s most appropriate 
future use in terms of usage or shape can be reduced 
with a good knowledge of the utilization scenario. 

Wireless communication introduces an innova- 
tive way to get the best out of the information world. 
The key aspect is mobility — anytime-anywhere ac- 
cess means quick and successful end-user experi- 
ence in getting the right information at the right time 
and in the right place (Leed, 1991). On the other 
hand, mobile devices are personal tools, they contain 
personal information (agenda, phone book), and they 
can be personalized. Moreover, they remain on most 
of the day and night and are carried anywhere as an 
absolutely needed connecting device. They not only 
enhance communication, they also change our social 
relations (Ravy, 2000). 

The first step will be the definition of scenarios of
ubiquitous interactive communication (UMTS, MMS,
iTV, etc.) in order to explore multi-access communication
contexts where users will be able to interact with
people, information, TV, other devices and machines,
and so forth, independently of place and
time, thanks to a system of distributed interfaces
(PC, TV, PDA, pocket PC, mobile phone, etc.)
(Cereijo, 2003). Once the 3G scenario and the
context of use are well defined, the designer should:

• Know the expectations, limitations, and behavior
of mobile users.

• Set up a user-centered design process.

• Design appropriate interaction models and patterns.

• Foresee transcultural adaptability. 

• Ensure a good communication of the services 
provided in order to guarantee a satisfactory 
added value to the user. 

• Keep a coherent and relevant multi-channel 
identity. 

• Recognize the level of usability of an interface 
according to: 

• Self-learning and length of the training 
phase. 



• Satisfaction of users’ expectations and 
users’ perceived value of the services 
provided. 

• Speed in task completion. 

• Number of irreversible mistakes. 

• Degree of interaction enabled.

The designer must know who the mobile user is
in order to know the user’s limitations
(Wharton, 1992), needs, and expectations; to
hypothesize the user’s behavior; to predict a model of
multi-access interaction; to measure (Sears, 1993)
the user’s performance and the emotive level
provided; to optimize the human interface (Shneiderman,
1987); and to find a balance between automation and
creative-affective interaction (which deals with the
pleasure of the user’s “savoir faire”).

According to the European experience with mobile
interactive communities that use WAP and
SMS, users appreciate enhanced interaction; personal,
always-on, and immediate communication
(information delivered with sensitivity to time and
context); new emotional experiences; new ways of
socializing and sharing experiences; and new
expressions of entertainment (e.g., multiplayer games,
group iTV interaction). Adequate targeting of
users in terms of lifestyle, needs, behavior, and role
in a social group will lead to a successful
personalization of the devices.

In order to provide users with a high perceived
value of mobile interfaces, it is crucial to develop a
human-centered microdesign approach that leads to
innovative interaction schemes (accessible via multiple
devices: PC, smart phones, PDA) and site
architecture as well as intuitive navigation. Scant attention
to human factors and behavioral science principles
(learning ergonomics, social psychology, biomechanics,
and HCI), which definitely is detrimental in Web
design, is fatal in the mobile Internet. For example,
user frustration with WAP is undoubtedly more
dangerous than on the Web, as mobile device
interfaces are much smaller, and delivery of service
is slower and more expensive (Cooper, 1995).

Even though there still are no standards for 
usability in mobile Internet, it is possible to indicate 
some basic guidelines for a user-centered 
microdesign. 







Ubiquitous Internet Environments 



INFODESIGN 

Due to space economy, naming is a basic issue in
microdesign (the use of a concise but relevant,
self-evident, and consistent language for contents and
the names of the navigation elements). It also requires a
strong effort of organization and hierarchy of content
(providing the user with progressive levels of
detail). Especially for mobile phones, where
graphics have a discreet presence, most of the
attention of design is focused on contents. This brief
way of communicating must consider, however, that
users have a limited memory, and not all information
in the world can be laid down. All sections must be
complete with all the information that users would
expect to find (e.g., the company’s address, etc.). The
Infodesign also must guarantee a maximum level of
interaction (e.g., after carrying out a business
search for a hotel in London, both the e-mail address
and the telephone number displayed must be perceived
intuitively by the user as interactive links,
indicating that the e-mail can be sent directly
and/or the number can be called just by clicking it).
Figure 3 shows a case of self-evident content; a
number in brackets expresses how many sections
are contained in each link.



MULTI-PLATFORM ARCHITECTURE 

The structure of the site is a complex framework that reflects the multi-device access (e.g., Web and WAP). The mobile Internet site’s architecture must be in harmony with the user’s workflow, and its layout must make the content easily accessible. Internet designers build the site’s structure once the information has been hierarchized and organized into different levels. This process is even more important in the mobile Internet. In microdesign, it is crucial to reach a good balance between horizontal and vertical organization (according to our research, for mobile phones and PDAs, not more than three clicks, or stages, are advisable in order to reach any service). As a general rule, we can say that vertical scrolling is easy on most devices; horizontal is not (Brewster, 1999).

Figure 3.
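The three-stage guideline can be checked mechanically against a site map. The following is a minimal sketch, assuming a hypothetical site map expressed as a dictionary of menu entries; the page names and the helper functions are illustrative and do not come from the article.

```python
from collections import deque

# Hypothetical site map: each key is a page, each value lists the pages it links to.
SITE_MAP = {
    "home": ["hotels", "news"],
    "hotels": ["london"],
    "london": ["hotel_details"],
    "hotel_details": ["booking_form"],
    "news": [],
    "booking_form": [],
}

MAX_STAGES = 3  # the "not more than three stages" guideline from the text


def stages_from_root(site_map, root):
    """Breadth-first search: fewest stages needed to reach each page from root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for child in site_map.get(page, []):
            if child not in depth:
                depth[child] = depth[page] + 1
                queue.append(child)
    return depth


def too_deep(site_map, root, limit=MAX_STAGES):
    """Return the pages that need more than `limit` stages, sorted by name."""
    depth = stages_from_root(site_map, root)
    return sorted(page for page, d in depth.items() if d > limit)


print(too_deep(SITE_MAP, "home"))
```

In this invented example only the booking form sits four stages from the home page, so it would be a candidate for promotion to a shallower level.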



GRAPHICS 

Ergonomically, such a small screen implies that, more than in any other interface, graphics must aid navigation (the relationship between a user’s intentions and the outcome must be intuitive) and comprehension of content. Graphics that reduce readability must be avoided. Using device-independent graphics across multiple devices to keep communication coherent compels us to use some crucial tricks (e.g., using high-contrast monochrome icons, avoiding scrolling) so as not to lose efficiency and efficacy when the same graphic elements are used with different interfaces. It is common to find sites for PDAs with unreadable text (e.g., type that is too small, or poorly chosen type and background colors). Moreover, graphic elements can be useful to differentiate different kinds of content (e.g., HiuGO’s WAP site uses the string to entitle a related section). Finally, graphic elements should contribute to service personalization (Reyes, 2001).

NAVIGATION 

First, an evolution can be noted from the first models of interfaces for the mobile Internet, which allowed scrolling in only one direction (sequential menu), to the 3G devices that permit bidirectional movement (matrix menu) by means of a rocker, central key, joystick, or keypad. In any case, basic principles of mobile Internet navigation such as evidence, fluidity, and quickness will not change much. Navigation must allow users to satisfy their expectations and needs easily (relevance) and to reach any section of the site easily and quickly. All pages must have the basic navigation elements (back, top, home, etc.); these must be evident, and confusion between buttons and non-linkable icons must be avoided (this problem is more visible on devices with touch screens, where linkable icons undoubtedly can help navigation). Mobile users hate to lose their way within the site, and they do not appreciate dead links, which can affect a site’s reliability.

Some recent research on user behavior shows a reluctance to use the physical buttons of mobile devices as function keys (only some expert users are inclined to use them regularly). Therefore, only a minimum number of navigation functions or commands (the most frequent ones) should be placed in the options menu; these commands also should be placed, with intentional redundancy, on the screen together with the rest of the functions and links, or on physical buttons.



FUTURE TRENDS 

According to the output of conferences such as Mobile HCI and Ubicomp, there are three main trends for the future of mobile HCI. The first remains improving the usability and accessibility of mobile interfaces, which are more and more “rich brains in poor bodies.” The second concerns the resolution of new dilemmas such as the dichotomies personalization/privacy, ubiquity/security, context-awareness/confidentiality, info-accessibility/info-overload, effective-communication/affective-communication, and so forth. Finally, multi-platform systems make the synchronization of information among multiple devices a challenge. Such systems make the dream of ubiquitous communication more realistic, but the resulting complexity of interaction needs to be hidden from the user.



CONCLUSION 

The incoming 3G-communication scenario will place users more and more at the center of a holistic communication network, which implies being physically surrounded by devices that will interact with users and machines. This environment of ubiquitous communication is becoming a mass-consumption phenomenon, where users might interact with different interfaces at the same time, which implies the need to create shared universal interaction codes, a coherent language across all holistic systems of interfaces. Innovative design patterns, human factor studies, behavioral theories, and evaluation techniques in ubiquitous communication scenarios have been investigated in order for these technologies to enjoy widespread popularity and usage.

KEY TERMS

Infodesign: A broad term for the design tasks of 
deciding how to structure, select, and present infor- 
mation. 

Microdesign: The process of designing how a 
user will be able to interact with a small artifact. 

Mistake: An error of reasoning or inappropriate 
subgoals, such as making a bad choice or failing to 
think through the full implications of an action. 

Multi-Access: Interacting with a computer us- 
ing more than one input or output channel at a time, 
usually suggesting drastically different input chan- 
nels being used simultaneously (e.g., voice, typing, 
scanning, photo, etc.). 

Multi-Channel: Different interfaces that can 
be available to the user for data entry in a multi- 
platform system (iTV, PC, mobile phone, smart- 
phone, pocket-PC, etc.). 

Multi-Channel Identity: A perceived commu- 
nicational coherence for each service provided 
through the whole system of interfaces. 

Reversible Actions: Any action that can be 
undone. Reversibility is a design principle that says 
people should be able to recover from their inevitable 
mistakes. 

Self-Usability: Mechanisms devised by users (e.g., the use of acronyms in an SMS) to make interaction with complex human artifacts more usable.

INTRODUCTION

Usability inspection method (UIM) is the term used 
for a variety of analytical methods designed to “find” 
usability problems in an interface design. The basic 
principle involves analysts inspecting the interface 
against a set of pre-determined rules, standards or 
requirements. Analysts inspect the interface and 
predict potential usability problems based on 
breaches of these rules. None of the UIMs currently 
in use are capable of detecting all of the problems 
associated with an interface. After describing some 
of the UIMs in use, this article will look at the 
authors’ work on improving these methods by focus- 
ing on the resources analysts bring to an inspection. 

BACKGROUND 

To better explain the work we have done on improving UIMs, three of the more commonly used UIMs are described here. These are by no means the only usability inspection methods; other examples include claims analysis (Carroll & Rosson, 1991) and pluralistic walkthroughs (Bias, 1994).

Heuristic Evaluation 

This method was developed by Nielsen (1992). The 
basis of the method is the comparison of an interface 
with a set of usability guidelines, known as the 
heuristics. Originally nine, there are currently 10 
guidelines dealing with areas such as visibility of 
system status, user control and freedom and error 
prevention. 

A heuristic evaluation is carried out by a number of different evaluators; five is recommended as the optimal number (Nielsen & Landauer, 1993). The problems identified by the individual evaluators are then merged into a master problem set.

This technique has been used at numerous stages in the development process, from paper prototypes to full software packages (Nielsen, 1990). The advantages of the technique, and the reasons the method is so popular, are that it can be used by novices as well as experts (although novices find fewer problems than experts; Nielsen, 1992) and that it is comparatively quick and inexpensive to employ. The disadvantage is that it tends to uncover only the more superficial problems with an interface; problems that require complex interaction on the part of the user are more likely to be missed by heuristic evaluation.

Cognitive Walkthrough 

This technique is based on the CE+ theory of exploratory learning, which combines Cognitive Complexity Theory (CCT) (Kieras & Polson, 1985) and Explanation-based Learning (EXPL) (Polson & Lewis, 1990). This theory states that users exploring a new interface are guided by general task goals, and they search for interface elements that promise to move them closer to these goals. Cognitive Walkthrough is a practical technique for applying CE+ in an evaluation and was fully outlined in Wharton, Rieman, Lewis, and Polson (1994). In contrast to Heuristic Evaluation, Cognitive Walkthrough can only be performed by experts.

Copyright © 2006, Idea Group Inc., distributing in print or electronic forms without written permission of IGI is prohibited.

The technique focuses on how well an interface can support a novice user without formal training. A Cognitive Walkthrough is usually performed by the interface designer with a small group of colleagues. It requires that certain information be available to the evaluators to be successful, including a description of the users and their knowledge resources, a description of the tasks to be performed, and the correct sequence of actions necessary to carry out the tasks. In performing the walkthrough, for each step in a task, the evaluation team asks a series of questions including:

• Is the correct action obvious to the user? 

• Will the user match the system’s response with the chosen action?
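The per-step questioning can be pictured as a simple record-keeping loop. The sketch below is only an illustration under stated assumptions: the `Step` class, the `candidate_problems` helper, and the example task are hypothetical, and only the two questions quoted above are included, although a full walkthrough asks more.

```python
from dataclasses import dataclass, field

# The two questions quoted in the text; a complete walkthrough uses a longer list.
QUESTIONS = (
    "Is the correct action obvious to the user?",
    "Will the user match the system's response with the chosen action?",
)


@dataclass
class Step:
    """One action in the correct sequence for a task (hypothetical structure)."""
    action: str
    answers: dict = field(default_factory=dict)  # question -> (ok, note)

    def record(self, question, ok, note=""):
        if question not in QUESTIONS:
            raise ValueError(f"unknown walkthrough question: {question!r}")
        self.answers[question] = (ok, note)


def candidate_problems(task, steps):
    """Return one entry per 'no' answer, in step order."""
    problems = []
    for number, step in enumerate(steps, start=1):
        for question, (ok, note) in step.answers.items():
            if not ok:
                problems.append((task, number, step.action, question, note))
    return problems


# Invented example task with two steps.
steps = [Step("Open the options menu"), Step("Select 'Search hotels'")]
steps[0].record(QUESTIONS[0], True)
steps[1].record(QUESTIONS[0], False, "entry is not visible without scrolling")
problems = candidate_problems("Find a hotel", steps)
print(problems)
```

Each "no" answer becomes a candidate usability problem tied to a specific task step, which is the kind of paperwork the criticism in the next paragraph refers to.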

Cognitive Walkthrough has been criticized for 
being too time consuming and requiring large amounts 
of paperwork to be completed, although attempts 
have been made to streamline the method (Rowley 
& Rhoades, 1992; Spencer, 2000). 

Heuristic Walkthrough 

Sears (1997) proposed a method which combines 
aspects of cognitive walkthroughs and heuristic 
evaluation to address the weaknesses of both. 

Heuristic walkthrough is a two-phase technique. 
The first phase has similarities with cognitive 
walkthrough, in that evaluators have a set of ques- 
tions to guide their exploration of the interface as 
well as a set of common user tasks; this is designed 
to expose the evaluators to the core functionality of 
the interface. During the second phase, the evalua- 
tors use usability heuristics to assess problems with 
the interface. However, unlike a straightforward 
heuristic evaluation which is relatively unstructured, 
the use of the heuristics in a heuristic walkthrough is 
focused on those areas of the interface identified as 
important in the first phase of the evaluation. 

It is claimed that the major advantage of heuristic walkthrough is its ability, compared to heuristic evaluation, to identify severe usability problems while avoiding the narrow focus commonly associated with cognitive walkthroughs.

Despite variations in the strengths and weaknesses of the various UIMs, the unreliability of the assessment of all such methods is well documented (e.g., Gray & Salzman, 1998). In practice, these methods fail to predict all of the usability problems in a design, and not all of the analysts’ predictions are true predictions. Such false predictions are commonly known as false positives. For a variety of reasons, analysts will make predictions about usability problems that in reality cause no problems to the users (Cockton & Woolrych, 2001).

For example, UIMs such as Heuristic Evaluation 
(Nielsen, 1992) are simply not good enough in their 
current state. The negative outcome of the use of 
such inspection methods is two-fold. First of all, 
because such methods are not thorough (they fail to 
find all of the usability problems), designs subjected 
to them can result in poor usability, especially if the 
nature of their flaws is not fully understood. For 
example, is there a type of usability problem that the 
method is typically good at finding? Or more impor- 
tantly, is there a type of problem the method is 
particularly bad at finding, and are these problems 
likely to be severe ones? Second, if the false positives are addressed as real usability problems, time and money are wasted in the redesign of usable features.

Although the assessment of UIMs has been very 
poor (Gray & Salzman, 1998), this has improved 
recently (e.g., Cockton, Woolrych, Hall, & 
Hindmarch, 2003). Research must, therefore, focus 
on the reliable assessment of inspection methods 
before work on inspection method improvement can 
begin. 

Despite these problems with inspection methods, 
there is still a place for reliable UIMs given the 
original rationale for their development — saving valu- 
able resources such as time and costs. The chal- 
lenge is to improve UIM quality without increasing 
costs. 



IMPROVING USABILITY INSPECTION METHODS

Thorough assessment of UIMs is reliant on accurate 
coding of analyst (non)-predictions. UIMs are com- 
monly assessed by their validity, thoroughness, and 
effectiveness (Sears, 1997), even though percent- 
ages fail to comprehensively assess UIMs (Woolrych 
& Cockton, 2000). 

Validity drops as the number of problems found 
with a UIM exceeds the real problems found. Ana- 
lysts make false predictions (false positives) as well 
as successful ones. Fewer false positives mean a 
more valid UIM. 

$$\text{Validity} = \frac{\text{count of real problems found using UIM}}{\text{count of problems predicted by UIM}}$$

The thoroughness of a UIM improves as more of the real problems that exist are actually found.

$$\text{Thoroughness} = \frac{\text{count of real problems found using UIM}}{\text{count of known usability problems}}$$

1. Predicted (true positive): real problem
2. Predicted, unconfirmed (false positive)
3. Not predicted (discovered and correctly eliminated): not a problem (true negative)
4. Not predicted (discovered and incorrectly eliminated): real problem (false negative)
5. Genuine miss (undiscovered real problem)



The effectiveness of a UIM is the weighted 
product of its thoroughness and validity (Hartson, 
Andre, & Williges, 2003). To calculate this accu- 
rately, we must correctly code all analysts’ predic- 
tions. This is not just to get the right percentages. To 
understand how and why false positives and genuine 
misses arise, we must first be able to properly code 
analysts’ predictions. 
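The three measures can be computed directly from counts of coded predictions. This is a minimal sketch with invented counts; the function names are illustrative, and effectiveness is shown as the plain product of thoroughness and validity because the text does not specify the weighting.

```python
def validity(real_found, predicted):
    """Share of the UIM's predictions that turned out to be real problems."""
    if predicted <= 0:
        raise ValueError("the UIM must have predicted at least one problem")
    return real_found / predicted


def thoroughness(real_found, known):
    """Share of the known real problems that the UIM actually found."""
    if known <= 0:
        raise ValueError("there must be at least one known problem")
    return real_found / known


def effectiveness(real_found, predicted, known):
    """Unweighted product of thoroughness and validity (weighting is an open assumption)."""
    return thoroughness(real_found, known) * validity(real_found, predicted)


# Invented example: 20 predictions, 12 of them real, 30 problems known overall.
print(validity(12, 20))           # 12 / 20
print(thoroughness(12, 30))       # 12 / 30
print(effectiveness(12, 20, 30))  # product of the two ratios
```

With these made-up counts the method is more valid than thorough: most predictions are real, but most known problems are still missed.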

Previous assessments of UIMs have concentrated on just three outcomes of analysts’ (non)-predictions: accurate predictions (hits), problems missed by the analysts (misses), and false positives (false alarms).

Our research (Cockton et al., 2003) has led to the development of the DARe (Discovery Analyst Resources) model of analyst performance in usability evaluation. The model is simple. At its heart is a rigid distinction between the knowledge resources and methods used for problem discovery and those used for problem analysis, where possible problems are confirmed or eliminated. It is called the DARe model after those knowledge resources that are critical to success with UIMs. Developments in the research pointed toward an analyst-centered approach, since several issues had not been addressed in previous research.

For instance, when the method and environment remained the same, why did some analysts discover some problems while other analysts did not? Did some analysts correctly eliminate some false positives while others “kept” them? Did some analysts discover real usability problems and incorrectly eliminate them (a false negative, previously coded as a genuine miss)?

In order to answer these questions, we have to 
consider the five possible outcomes from inspection 
and accurately code them (Cockton, Woolrych, & 
Hindmarch, 2004), while analyzing the problem dis- 
covery, strategy, and analysis resources adopted by 
individual analysts for potential problem analysis. The five possible outcomes from inspection are those listed above, following the validity and thoroughness formulas.



One and five are easy to explain. A true positive 
is simply an accurate prediction of a usability prob- 
lem by the analyst that in reality is a real usability 
problem to the user. A genuine miss is a real 
usability problem that has not been considered at all 
by the analyst; it has simply not been discovered. 
Outcome #2 is a false positive, where analysts believe an element or feature could cause user difficulties, but in real usage these difficulties do not occur.

The final two outcomes (#3 and #4) are true and 
false negatives respectively. A true negative is 
where an analyst discovers a potential problem, 
analyses it, and correctly eliminates it (effectively 
eliminating a false positive before it is reported). A 
false negative is a potential problem that is discov- 
ered and is eliminated in analysis, but in real usage 
is a real problem. 
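The five outcomes follow from three yes/no facts about a potential problem: whether the analyst discovered it, whether the analyst kept (reported) it, and whether it proves to be a real problem in use. The sketch below illustrates that coding logic; the `Outcome` enum and `code_outcome` function are hypothetical names, not part of the published model.

```python
from enum import Enum


class Outcome(Enum):
    TRUE_POSITIVE = 1   # predicted, real problem
    FALSE_POSITIVE = 2  # predicted, unconfirmed
    TRUE_NEGATIVE = 3   # discovered, correctly eliminated
    FALSE_NEGATIVE = 4  # discovered, incorrectly eliminated
    GENUINE_MISS = 5    # undiscovered real problem


def code_outcome(discovered, reported, real_problem):
    """Map the three facts about a potential problem to one of the five outcomes.

    Returns None when nothing was discovered and nothing is wrong,
    since there is then no (non)-prediction to code.
    """
    if reported and not discovered:
        raise ValueError("a problem cannot be reported without being discovered")
    if not discovered:
        return Outcome.GENUINE_MISS if real_problem else None
    if reported:
        return Outcome.TRUE_POSITIVE if real_problem else Outcome.FALSE_POSITIVE
    return Outcome.FALSE_NEGATIVE if real_problem else Outcome.TRUE_NEGATIVE


print(code_outcome(discovered=True, reported=False, real_problem=True))
```

Note that a false negative and a genuine miss differ only in the `discovered` flag, which is why a record of eliminated problems is needed to tell them apart.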

It is paramount to the accurate and thorough 
assessment of usability inspection methods that 
these outcomes are not miscoded. To address the 
issue of miscoding, two tools were developed. These 
were Extended Structured Problem Report For- 
mats (ESPRFs) and falsification testing (see next 
section). Without these tools there was a high risk 
of miscoding analyst (non)-predictions. For ex- 
ample, without ESPRFs, there was a risk of coding 
a genuine miss as a false negative and vice-versa. 
The ESPRF requires analysts to record all discov- 
ered problems, even if they are subsequently elimi- 
nated during analysis. Moreover, without ESPRFs, 
false negatives could not be coded at all, as analysts 
would not normally report improbable problems 
eliminated during inspection. 

Falsification testing (described in detail in a later section) is necessary to accurately code all of the possible outcomes. In simple terms, task sets are developed based on analyst (non)-predictions extracted from the ESPRFs. Each (non)-prediction is rigorously “stressed” in user testing in order to ensure accurate coding; in other words, we can be confident that if a problem exists it will be found.

Extended Structured Problem Report Formats (ESPRFs)

ESPRFs were developed as research support tools. The original version was the Structured Problem Report Format (SPRF) (Cockton & Woolrych, 2001). This version was developed to aid problem matching and extraction and was intended for both research and practical use. The SPRF was derived from an analysis of usability problems, following from difficulties faced in the assessment of UIMs (Lavery & Cockton, 1997; Lavery, Cockton, & Atkinson, 1997; Cockton & Lavery, 1999). General principles from SUPEX such as transcription, segmentation, difficulty isolation, and generalization were all applied.

The purpose of the SPRF was to address the issues associated with constructing a master problem set from multiple analyst inspections. A simple description of reported problems was not enough: different analysts’ descriptions of the same predicted usability problem can vary greatly, even to the point of appearing to describe completely different events. By requiring the analyst to report further detail about each problem, the SPRF provided multiple points of reference for accurate problem merging when creating a master problem set. In addition to a problem description, the SPRF required the analyst to record the likely/actual difficulties associated with the problem, any specific contexts in which the problem occurred (if applicable), and the assumed causes of the problem.

Experience has shown that, in many cases, prob- 
lem merging is possible using the problem descrip- 
tion alone. However, there were many instances 
where the problem description alone was insuffi- 
cient to confirm a confident match, but reference to 
other descriptions did enable problem matching. 

The current ESPRF (Cockton & Woolrych, 2001) was developed purely as a research tool. It is an extension of the SPRF that is designed to extract analyst search strategy and the knowledge resources used in problem analysis, and to aid in the accurate coding of analyst (non)-predictions. There are four parts to the ESPRF. Part 1 consists of the original SPRF, for problem extraction and matching.

Part 2 of the ESPRF addresses discovery resources and methods. Analysts were required to explain their discovery strategy, such as system-centered or user-centered, unstructured or structured. Essentially there are four strategies for problem discovery: system scanning, system searching, goal playing, and method following. The first two are system-centered; the first and third are unstructured. Different knowledge resources are required: little if any for system scanning; product knowledge for system searching; user/domain knowledge for goal playing; and task knowledge for method following. Part 2 also addresses confirmation rationales for problems that analysts decide to retain as probable problems.
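The four discovery strategies form a two-by-two grid (system- or user-centered, structured or unstructured), each with its own knowledge requirement. The lookup table below is a sketch of that grid; the `Strategy` tuple and the `strategies_where` helper are illustrative names, and the user-centered and structured labels are inferred as the complements of the categories the text names.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    focus: str        # "system" or "user"
    structured: bool  # False means unstructured
    knowledge: str    # knowledge resource the strategy relies on


DISCOVERY_STRATEGIES = {
    "system scanning": Strategy("system", False, "little if any"),
    "system searching": Strategy("system", True, "product knowledge"),
    "goal playing": Strategy("user", False, "user/domain knowledge"),
    "method following": Strategy("user", True, "task knowledge"),
}


def strategies_where(focus=None, structured=None):
    """Return strategy names matching the given focus and/or structure."""
    return [
        name
        for name, s in DISCOVERY_STRATEGIES.items()
        if (focus is None or s.focus == focus)
        and (structured is None or s.structured == structured)
    ]


print(strategies_where(focus="system"))
print(strategies_where(structured=False))
```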

Part 3 deals specifically with heuristic applica- 
tion to individual problems. Analysts were required 
to provide evidence of conformance rather than just 
name a heuristic (as was the case in the SPRF). 

Part 4 requires analysts to justify any problem 
elimination, with specific reference to user impact 
and behavior. The initial extensions focused on 
information that was clearly missing from the SPRF, 
that is, how analysts approach discovery and whether/ 
why elimination occurs. 
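The four parts of the ESPRF can be pictured as one record per discovered problem. The following is a hedged sketch only: the field names, the `ESPRFReport` class, and the consistency checks are assumptions made for illustration and are not the published form layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ESPRFReport:
    # Part 1: the original SPRF fields
    description: str
    difficulties: str
    contexts: str
    assumed_causes: str
    # Part 2: discovery strategy and, for retained problems, the confirmation rationale
    discovery_strategy: str
    confirmation_rationale: Optional[str] = None
    # Part 3: heuristic applied, with supporting evidence
    heuristic: Optional[str] = None
    heuristic_evidence: Optional[str] = None
    # Part 4: justification when the problem is eliminated
    elimination_justification: Optional[str] = None

    @property
    def eliminated(self):
        return self.elimination_justification is not None

    def omissions(self):
        """Return human-readable inconsistencies (empty list if none)."""
        issues = []
        if self.eliminated and self.confirmation_rationale:
            issues.append("problem is both confirmed and eliminated")
        if not self.eliminated and not self.confirmation_rationale:
            issues.append("retained problem lacks a confirmation rationale")
        if self.heuristic and not self.heuristic_evidence:
            issues.append("heuristic named without evidence")
        return issues


# Invented example of a discovered problem that the analyst eliminated.
report = ESPRFReport(
    description="Back link looks like plain text",
    difficulties="User may not find the way back",
    contexts="Search results page",
    assumed_causes="Link is not underlined",
    discovery_strategy="goal playing",
    elimination_justification="Users rely on the device's back key instead",
)
print(report.eliminated, report.omissions())
```

Because eliminated problems are recorded rather than discarded, a record like this one can later be coded as a true or false negative once user testing shows whether the difficulty occurs.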

As well as preventing the miscoding of analyst 
(non)-predictions, the ESPRF also allows for the 
identification of analyst discovery strategy and rel- 
evant resources used in discovered problem analy- 
sis. ESPRFs are essential tools in usability inspec- 
tion method assessment; however, ESPRFs need 
the support of falsification testing for thorough analysis 
of usability inspection methods. 

FALSIFICATION TESTING 

The method for falsification testing (Woolrych, 
Cockton, & Hindmarch, 2004) involves the rigorous 
testing of analyst predictions via user testing. Ana- 
lyst predictions are analyzed and merged into a 
master problem set. Thorough analysis of each predicted problem determines the individual difficulties associated with it.

Within the context of the test application, task sets are systematically derived to expose these likely difficulties, if the prediction is valid. Put simply, each individual’s predicted problems are “stressed” via user testing to ensure a high level of confidence in final coding.

Falsification testing is a fixed task user testing 
method. Users are restricted in their choice of task 
approach and execution. The goal of falsification 
testing is an accurate assessment of validity, that is, 
to accurately determine if a prediction is a “hit” or 
“false positive”, and, with the assistance of ESPRFs, 
to accurately code true/false negatives. 

The principle is simple: if a prediction is accurate, then it will be confirmed by user testing. If a predicted problem does not materialize, we can have confidence that it does not exist and that the particular prediction can be coded as a false positive. Falsification testing ensures that false positive coding of predictions is not a consequence of incomplete coverage in user testing.
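The coverage requirement can be made explicit: a prediction should be coded as a false positive only if every difficulty it implies was exercised by some task. The sketch below illustrates this under invented data; the function names, the prediction identifiers, and the "undetermined" label are assumptions, not terms from the method itself.

```python
def uncovered(predictions, task_sets):
    """List (prediction_id, difficulty) pairs that no task exercises.

    predictions: {prediction_id: set of difficulty labels}
    task_sets:   {task_name: set of (prediction_id, difficulty) pairs exercised}
    """
    exercised = set().union(*task_sets.values())
    wanted = {(pid, d) for pid, ds in predictions.items() for d in ds}
    return sorted(wanted - exercised)


def code_prediction(pid, predictions, task_sets, observed):
    """Code one prediction as 'hit', 'false positive', or 'undetermined'.

    observed: set of (prediction_id, difficulty) pairs seen in user testing.
    """
    if any((pid, d) in observed for d in predictions[pid]):
        return "hit"
    if any(p == pid for p, _ in uncovered(predictions, task_sets)):
        return "undetermined"  # coverage gap: too early to call it a false positive
    return "false positive"


# Invented example data.
predictions = {
    "P1": {"cannot find back link"},
    "P2": {"mistakes icon for button", "scrolls past result"},
}
task_sets = {"T1": {("P1", "cannot find back link"), ("P2", "mistakes icon for button")}}
observed = {("P1", "cannot find back link")}

print(uncovered(predictions, task_sets))
print(code_prediction("P2", predictions, task_sets, observed))
```

Adding a second task that exercises the remaining difficulty of P2 closes the gap, after which an unobserved P2 can be coded as a false positive with confidence.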



FUTURE TRENDS

Previous attempts to “streamline” and “fix” usability inspection methods have been unsuccessful, primarily because assessment of UIMs was inadequate. The DARe model for analyst behavior and associated research has provided a more rigorous assessment of UIMs. Much more is now known about the limitations of UIMs in general, although work in this area is not yet complete. The weight of evidence in current research points towards an analyst-centered approach to usability inspection method improvement.

Future work will concentrate on identifying the ideal search strategy for problem discovery during inspection. Also, further work is needed to identify and understand the knowledge resources that help analysts find and correctly analyze discovered usability problems.

CONCLUSION

As long as practitioners continue to use usability inspection methods in their current state, a likely outcome is products with unusable features, which is a discredit to the HCI community. One way forward is to better understand analyst behavior during inspection. Understanding such behavior starts with confidence in the accurate coding of all analysts’ (non)-predictions. Recorded rationales for problem confirmation/elimination extracted from ESPRFs can be used to identify appropriate knowledge resources for potential problem analysis. Analysis of such knowledge resources can identify those that aid analysts in appropriate problem confirmation/elimination analysis.

This is essentially the framework for the DARe model for analyst performance in inspection. Further work is necessary; however, the DARe model can provide the focus of future research that will lead to improved assessment of usability inspection. In turn, positive inroads can be made in better analyst training with emphasis on knowledge resources for both problem discovery and analysis. Then we can consider improving individual usability inspection methods.

KEY TERMS

False Negative: A potential usability problem 
discovered in a usability inspection that upon analy- 
sis is incorrectly eliminated by the analyst as an 
improbable problem. The discovered problem is 
confirmed in real use as causing difficulties to users. 

False Positive: A prediction of a usability prob- 
lem reported in a usability inspection that in reality is 
not a problem to the real users. 

Falsification Testing: A method for testing the 
accuracy of predictions made during usability in- 
spections. 

Genuine Miss: A usability problem that causes 
user difficulties that remains undiscovered in usabil- 
ity inspection. 

Thoroughness: A measure for assessing usability inspection methods. Determined by dividing the number of real problems found by the UIM by the number of known problems.

True Negative: A potential usability problem discovered in a usability inspection that upon analysis is correctly eliminated by the analyst as an improbable problem. The discovered problem is confirmed in real use as causing the user no difficulties.



True Positive: A prediction of a usability problem reported in a usability inspection that is proven to be a real problem in actual use with real users.

Usability Inspection Methods (UIM): The term given to a variety of analytical methods for predicting usability problems in designs.

Validity: A measure for assessing usability inspection methods. Determined by dividing the number of real problems found by the UIM by the number of problems predicted by the UIM.

INTRODUCTION

Cognitive load theory (CLT) is currently the most 
prominent cognitive theory pertaining to instruc- 
tional design and is referred to in numerous empirical 
articles in the educational literature (for example, 
Brünken, Plass, & Leutner, 2003; Chandler & 
Sweller, 1991; Paas, Tuovinen, Tabbers, & Van 
Gerven, 2003; Sweller, van Merriënboer, & Paas, 
1998). CLT was developed to assist educators in 
designing optimal presentations of information to 
encourage learning. CLT has also been extended 
and applied to the design of educational hypermedia 
and multimedia (Mayer & Moreno, 2003). The theory 
is built around the idea that the human cognitive 
architecture has inherent limitations related to ca- 
pacity, in particular, the limitations of human work- 
ing memory. As Sweller et al. (pp. 252-253) state: 

The implications of working memory limitations 
on instructional design cannot be overstated. All 
conscious cognitive activity learners engage in 
occurs in a structure whose limitations seem to 
preclude all but the most basic processes. 
Anything beyond the simplest cognitive activities 
appear to overwhelm working memory. Prima 
facie, any instructional design that flouts or 
merely ignores working memory limitations 
inevitably is deficient. It is this factor that provides 
a central claim to cognitive load theory. 

In order to understand the full implications of 
cognitive load theory, an overview of the human 
memory system is necessary. 



BACKGROUND 

The Human Memory System: The Modal Model of Memory

It has long been accepted that the human memory 
system is made up of two storage units: long-term 
memory and working memory. There is an abundance of behavioral (for example, Deese & Kaufman, 1957; Postman & Phillips, 1965) and neurological evidence (Milner, Corkin, & Teuber, 1968; Warrington & Shallice, 1969) to support this theory. 
Long-term memory is a repository for information 
and knowledge that we have been exposed to repeti- 
tively or that has sufficient meaning to us. Long-term 
memory is a memory store that has an indefinable 
duration but is not conscious; that is, any information 
in long-term memory must first be retrieved into 
working memory for us to be aware of it. Hence, any 
conscious manipulation of information or intentional 
thinking can only occur when this information is 
available to working memory. The depth and dura- 
tion of processing in working memory determines 
whether information is passed on to long-term 
memory. Once knowledge is stored in long-term 
memory, we can say that enduring learning has 
occurred. 

Working Memory Limitations 

Unfortunately, working memory has some very definite limitations. First, there is a limit of volume. Baddeley, Thomson, and Buchanan (1975) reported that the size of working memory is equal to the amount of information that can be verbally rehearsed in approximately 2 seconds. A second limitation of working memory concerns time. When information is attended to and enters working memory, it will decay in approximately 20 seconds if it is not consciously processed.



CLT AND EDUCATIONAL HYPERMEDIA

The modal model of human memory, specifically 
these limitations of working memory, is the basis for 
CLT. A version of CLT, Mayer and Moreno’s 
(2003) selecting-organizing-integrating theory of 
active learning, is specifically targeted to learning in 
hypermedia environments. The theory is built upon 
three core assumptions from the modal model of 
memory: the dual channel assumption, the limited 
capacity assumption, and the active processing as- 
sumption. The dual channel assumption is based on 
the notion that working memory has two sensory 
channels, each responsible for processing different 
types of input. The auditory or verbal channel pro- 
cesses written and spoken language. The visual 
channel processes images. The limited capacity 
assumption applies to these two channels; that is, 
each of these channels has a limit as to the amount 
of information that can be processed at one time. 
The active processing assumption is derived from 
Wittrock’s (1989) generative learning theory and 
asserts that substantial intentional processing is re- 
quired for meaningful learning. With these assump- 
tions as a foundation, Mayer and Moreno have 
focused on three key mental activities that can place 
demands on available cognitive resources: attention, 
mental organization, and integration. 

Improving Working Memory Capacity 
Directly 

How does CLT propose to work within the limits of 
working memory? To date, the solution for reducing 
cognitive load has focused on directly reducing the 
demands on working memory. Mayer and Moreno 
(2003) outline a number of methods for reducing 
cognitive load in hypermedia: (a) Resting on the dual 
channel assumption, cognitive load on one channel 
can be relieved by spreading information across both 
modalities, that is, by providing information in both a 
visual and auditory format, (b) presenting material in 
segments and providing pretraining on some material 
can reduce overload, (c) the redundancy of informa- 
tion can be eliminated, and (d) visual and auditory 
information can be synchronized. 

Mayer and Moreno (2003) also refer to “inciden- 
tal processing” as “cognitive processes that are not 
required for making sense of the presented material 
but are primed by the design of the learning task” (p. 
45). Incidental processing is considered undesirable 
as it relates to the cognitive resources that are 
needed to process extraneous, irrelevant material 
that may be included in the presentation. Mayer and 
Moreno advocate weeding out this extraneous ma- 
terial to reduce cognitive load. 






Measuring Cognitive Load 



If the premise of cognitive load theory is correct, 
then certainly a primary activity in designing instruc- 
tional materials must be the meaningful measure- 
ment of cognitive load. This is not a simple task as 
the method of measurement is dependent on the 
constructs that different researchers use to describe 
cognitive load. For example, Paas et al. (2003) 
propose that three constructs define cognitive load: 
mental load, which reflects the interaction between 
task and subject characteristics; mental effort, which 
reflects the actual cognitive reserves that are ex- 
pended on the task; and performance, which can be 
defined as the learner’s achievements. Previous 
research in cognitive load measurement has relied 
on three types of measures to assess the cognitive 
load of the user: (a) physiological measures such as 
heart rate and pupillary responses, (b) performance 
data on primary and secondary tasks, and (c) self- 
reported ratings (Paas et al.). These measures have been 
used in various configurations to measure overall 
cognitive load (Brünken et al., 2003; Chandler & 
Sweller, 1996; Gimino, 2002; Paas, 1992). To date, 
most efforts to measure cognitive load have focused 
on self-reported ratings (see Paas et al.). 
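
As a rough illustration of how two of these measure types might be summarized, the sketch below computes a secondary-task cost and a mean self-reported rating. It is a hypothetical operationalization, not a procedure taken from the cited studies: the function names, the use of reaction times in milliseconds, and the 1-to-9 rating range are all assumptions made for the example.

```python
from statistics import mean


def secondary_task_cost(baseline_rt_ms, dual_task_rt_ms):
    """Mean slowdown on a secondary task when the primary task is added.

    Assumption: a larger slowdown is read as higher load on the primary task.
    """
    if not baseline_rt_ms or not dual_task_rt_ms:
        raise ValueError("both reaction-time lists must be non-empty")
    return mean(dual_task_rt_ms) - mean(baseline_rt_ms)


def mean_self_rating(ratings, scale_min=1, scale_max=9):
    """Average self-reported effort; the 1-to-9 range is a hypothetical scale."""
    if not ratings:
        raise ValueError("ratings must be non-empty")
    if any(r < scale_min or r > scale_max for r in ratings):
        raise ValueError("rating outside the assumed scale")
    return mean(ratings)


if __name__ == "__main__":
    print(secondary_task_cost([300, 320, 310], [400, 420, 410]))
    print(mean_self_rating([3, 5, 7]))
```

Physiological measures are left out of the sketch because they depend on the recording equipment used.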



FUTURE TRENDS 

Our ability to reduce cognitive load in educational 
hypermedia rests on our thorough definition of the 
underlying constructs of cognitive load as well as the 
design of test mechanisms that allow us to measure 
cognitive load and detect situations where cognitive 
resources are overtaxed. Future research directed at 
these two issues will contribute to the explanatory 
power of the theory and allow us to apply these 
theoretical principles to educational settings that make 
use of hypermedia materials. 

CONCLUSION 

The cognitive load theory for educational hypermedia 
has emerged as a prominent theory for guiding in- 
structional designers in the creation of educational 
hypermedia. It is based on the modal model of human 
memory, which posits that there are limits to the 
working memory store that impact the amount of 
cognitive effort that can be expended on a given task. 
When available cognitive resources are surpassed, 
performance on memory and learning tasks is de- 
graded, a condition referred to as cognitive overload. 
CLT for educational hypermedia holds that educational 
materials must be designed to take these limitations 
into account. In order to do this, two 
obstacles to using CLT to its full advantage must be 
resolved: (a) the diversity of the descriptions of its 
underlying constructs and (b) the lack of valid and 
reliable methods for the measurement of cognitive 
load. 

Limited Capacity Assumption: The limited 
capacity assumption applies to the dual channels of 
visual and auditory (verbal) processing. The assumption is 
that each of these channels has a limit as to the 
amount of information that can be processed at one 
time. 






Long-Term Memory: Long-term memory is a 
repository for information and knowledge that we 
have been exposed to repetitively or that has suffi- 
cient meaning to us. Long-term memory is a memory 
store that has an indefinable duration but is not 
conscious; that is, any information in long-term 
memory must first be retrieved into working memory 
for us to be aware of it. 



Warrington, E. K., & Shallice, T. (1969). The selective 
impairment of auditory verbal short-term 
memory. Brain, 92, 885-896. 

Wittrock, M. C. (1989). Generative processes of 

KEY TERMS 

Active Processing Assumption: The active 
processing assumption asserts that intentional and 
significant mental processing of information must 
occur for enduring and meaningful learning to take 
place. 

Cognitive Load Theory: Cognitive Load 
Theory asserts that the capacities and limitations of 
the human memory system must be taken into ac- 
count during the process of instructional design in 
order to produce optimal learning materials and 
environments. 

Dual Channel Assumption: The dual channel 
assumption is based on the notion that working 
memory has two sensory channels, each responsible 
for processing different types of input. The auditory 
or verbal channel processes written and spoken 
language. The visual channel processes images. 



Mental Effort: A second construct related to 
measuring cognitive load, “mental effort is the as- 
pect of cognitive capacity that is actually allocated to 
accommodate the demands imposed by the task” 
(Paas et al., 2003, p. 64). 

Mental Load: One of three constructs devised 
by Paas et al. (2003) to assist in the measurement of 
cognitive load. Mental load reflects the interaction 
between task and subject characteristics. According 
to Paas et al. (2003), “it provides an indication of 
the expected cognitive capacity demands and can be 
considered an a priori estimate of cognitive load” (p. 
64). 

Performance: Performance is the third con- 
struct in Paas et al.’s (2003) definition of cognitive 
load and is reflected in the learner’s measured 
achievement. Aspects of performance are speed of 
completing a task, number of correct answers and 
number of errors. 

Short-Term or Working Memory: Short-Term 
or Working Memory refers to a type of memory 
store where conscious mental processing occurs, 
that is, thinking. Short-term memory has a limited 
capacity and can be overwhelmed by too much 
information. 

INTRODUCTION 

Numerous technical, cognitive, social, and organiza- 
tional constraints and biases can reduce the quality 
of usability data, preventing optimal responses to a 
system’s usability deficiencies. Detecting and ap- 
propriately responding to a system’s usability defi- 
ciencies requires powerful collection methods and 
tools, skilled analysts, and successful interaction 
amongst usability specialists, developers, and other 
stakeholders in applying available resources to pro- 
ducing an improved system design. The detection of 
usability deficiencies is largely a matter of analyzing 
a system’s characteristics and observing its perfor- 
mance in use. Appropriate response involves the 
translation of collected data into usability problem 
descriptions, the production of potential design solu- 
tions, and the prioritization of these solutions to 
account for pressures orthogonal to usability im- 
provements. These activities are constrained by the 
effectiveness and availability of methods, tools, and 
organizational support for user-centered design pro- 
cesses. The quality of data used to inform system 
design can, for example, be limited by a collection 
tool’s ability to record user and system performance, 
an end user’s ability to accurately recall past interactions 
with a system, an analyst’s ability to persuade 
developers to implement changes, and an 
organization’s commitment to devoting resources to 
user-centered design processes. 

The remainder of this article (a) briefly reviews 
basic usability concepts, (b) discusses common bar- 
riers to successfully collecting, analyzing, and react- 
ing to usability data, and (c) suggests future trends in 
usability research. 

BACKGROUND 

Usability barriers hinder data collection processes, 
reduce the quality of usability data, and therefore 
hinder the detection of and response to a system’s 
deficiencies. Barriers to system usability are necessarily 
barriers to one or more dimensions of usability. 
Usability dimensions are commonly taken to include 
at least user efficiency, effectiveness, and subjec- 
tive satisfaction with a system in performing a 
specified task in a specified context (ISO 9241-11, 
1998), and frequently also include system memora- 
bility and learnability (Nielsen, 1993). 
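
The three core dimensions can be given simple numeric form for a batch of test sessions. The sketch below is one possible operationalization under stated assumptions, not a definition taken from the standard: the `TaskAttempt` record, the choice of completion rate for effectiveness, completed tasks per minute for efficiency, and a mean questionnaire rating for satisfaction are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    """One user's attempt at one task (hypothetical record layout)."""
    completed: bool
    seconds: float
    satisfaction: int  # e.g. a post-task rating on an assumed 1-5 scale


def summarize(attempts):
    """Summarize effectiveness, efficiency, and satisfaction for a task."""
    if not attempts:
        raise ValueError("attempts must be non-empty")
    completed = sum(1 for a in attempts if a.completed)
    total_minutes = sum(a.seconds for a in attempts) / 60
    if total_minutes <= 0:
        raise ValueError("total time must be positive")
    return {
        # Share of attempts that reached the task goal.
        "effectiveness": completed / len(attempts),
        # Completed tasks per minute of user time spent.
        "efficiency": completed / total_minutes,
        # Mean subjective rating across all attempts.
        "satisfaction": mean([a.satisfaction for a in attempts]),
    }


if __name__ == "__main__":
    sessions = [
        TaskAttempt(True, 60, 4),
        TaskAttempt(True, 120, 5),
        TaskAttempt(False, 180, 2),
        TaskAttempt(True, 120, 3),
    ]
    print(summarize(sessions))
```

Memorability and learnability are omitted because they require repeated sessions with the same users rather than a single batch of attempts.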

Usability data are defined by Hilbert and Redmiles 
(2000) as any information used to measure or iden- 
tify factors affecting the usability of a system being 
evaluated. Such data are collected via usability 
evaluation methods (UEMs), methods or tech- 
niques that can assign values to usability dimensions 
(J. Karat, 1997) and/or indicate usability deficiencies 
in a system (Hartson, Andre, & Williges, 2003). 
Usability evaluation may be analytic (based on inter- 
face design attributes, independent of actual usage) 
or empirical (based on observations of system per- 
formance in actual use; Hix & Hartson, 1993), and 
may be formative (employed during system develop- 
ment) or summative (employed after system deploy- 
ment; Scriven, 1967). 

Usability data quality refers to the extent to 
which the data efficiently and effectively predict 
system usability in actual usage, can be efficiently 
and effectively analyzed, and can be efficiently and 
effectively reacted to. High-quality usability data 
indicate real system deficiencies (validity) that will 
be repeatedly encountered by individual users (reli- 
ability) and by a wide range of users (representa- 
tiveness); represent deficiencies in their entirety 
(completeness); can be easily translated by usability 
analysts into problem descriptions that accurately 
represent the underlying deficiencies (communica- 
tive effectiveness and efficiency); indicate prob- 
lems that seriously influence the quality of users’ 
experiences with the system (severity); and per- 
suade developers and other stakeholders to imple- 
ment design changes (downstream utility) that 
verifiably improve system usability (impact) at low 
cost (cost effectiveness). (For a discussion of each 
of these dimensions, see the article titled “Usability 
Data Quality” in this encyclopedia.) 

BARRIERS TO USABILITY-DATA 
QUALITY 

The successful collection and analysis of, and reaction to, 
usability data are hindered in practice by numerous 
constraints and biases. Far more empirical work 
identifying barriers to data quality has focused on 
collection than analysis and reaction for the obvious 
reasons: Collection processes are more amenable to 
experimental control and more accessible to researchers 
(i.e., easier to simulate or observe in 
entirety). Nonetheless, in recent years, barriers 
throughout the development process have been iden- 
tified, as discussed in this section. 

Resource Constraints 

If representative customers and end users are dis- 
tributed (especially internationally), costs become 
the primary barrier to (empirical) collection, which 
will tend to drive the selection of methods (Englefield, 
2003; Stanton & Baber, 1996; Vasalou, Ng, Wiemer- 
Hastings, & Oshlyansky, 2004) and affect data 
quality. As a result, informal data-collection meth- 
ods are more frequently employed in practice than 
formal methods (Vredenburg, Mao, Smith, & Carey, 
2002). 

Perhaps the most common constraint arises from 
the timing of data collection in the development 
cycle. Not surprisingly, the general finding is that the 
later usability data are collected, the less likely they 
are to result in design changes (Bias & Mayhew, 
1994). This problem can be exacerbated when a 
short development cycle is demanded by concerns 
orthogonal to usability. 

When data collection is performed at low cost 
(for example, by using nonintrusive remote collec- 
tion methods), the resource burden is often not 
avoided but rather shifted to analysis since such 
methods can result in more data than are possible to 
translate into problem descriptions within the devel- 
opment cycle. 



User Ability and Motivation 

One of the most widely employed collection meth- 
ods, think-aloud usability testing, requires users to 
engage in a highly unnatural activity, namely, ver- 
bally unloading a stream of consciousness while 
interacting with a system (Nielsen, 1993). Lin, 
Choong, and Salvendy (1997) point out that many 
users have difficulty in keeping cognitive processes 
verbalized while performing tasks, and that expert 
users in particular find it difficult to verbalize their 
(often automatic) processes. When activities are 
routine or would not normally require attention, 
concurrent verbalization is not only difficult, but can 
affect cognitive processes (Birns, Joffre, Leclerc, & 
Paulsen, 2002; Ericsson & Simon, 1980) and there- 
fore hinder the validity of behavioral observations 
made during testing. 

Remote methods in which the setting of data 
collection is more realistic do not avoid these barri- 
ers. Fundamentally, data collection is limited by the 
ease of use of the collection instrument (Hartson & 
Castillo, 1998) and users’ ability to notice usability 
problems as they occur (Galdes & Halgren, 2001), 
ability to evaluate incomplete prototypes with miss- 
ing functionality, ability to remember and articulate 
the context of a previously encountered problem (J. 
Karat, 1997), and willingness to accept the cost of 
providing feedback. 

Selective Feedback and Feedback Bias 

Under many circumstances, usability data that could 
drive system improvements are simply never col- 
lected. Even when mechanisms are in place for 
reporting critical incidents during actual use, users 
will choose which problems to report, often neglect- 
ing those they deem unimportant (Costabile, 2001). 
Neglecting low-severity problems can in some cases 
be a benefit to data quality, but only to the extent that 
users are able to recognize which problems recur 
and to tune their feedback activities effectively. 
Users conversely will often neglect reporting high- 
severity problems, naturally in favor of focusing 
their attention on correcting such problems and 
getting their work done. 

Neglecting feedback altogether may in some 
cases be the lesser of two feedback evils, the other 
being speculation. Hilbert and Redmiles (1999) have 
noted speculative feedback from both novices and 
experts that can affect data quality. Novices will in 
some cases speculate that a real usability problem 
would not interfere with expert usage, and conse- 
quently neglect to report serious system learnability 
deficiencies, while experts will incorrectly speculate 
on the usability of interface features for novice users 
rather than focus on real problems they encounter. 
User assumptions about the usefulness of their own 
feedback are a particularly difficult problem to tackle 
in part because such assumptions may not be articu- 
lated in the context of data collection; when custom- 
ers filter feedback to indicate only those problems 
they foresee as feasibly (and quickly) being ad- 
dressed by a software vendor, for example, such 
filtering is unlikely to be made explicit without prompt- 
ing. 

Indirect Data 

A key effect of attempting to maximize the cost 
effectiveness of data collection is an increased reli- 
ance on indirect sources of usability data during 
formative evaluation stages. Despite its shortcom- 
ings, face-to-face usability testing has the advantage 
of allowing analysts to view problems as they occur 
and to clarify end-user comments and reasoning 
while interacting with a prototype or live system. 
When cost constraints require remote collection, the 
resulting data frequently lack context (Hilbert & 
Redmiles, 1998). Moreover, users will frequently 
delay providing usability feedback until well after an 
important incident has occurred (Hartson & Castillo, 
1998). Birns et al. (2002) describe instances of users 
encountering usability problems only to later blame 
themselves for the problem after subsequent interac- 
tions with the system. Thus, the goal of collecting 
data uninfluenced by such factors can be compro- 
mised. Not only are indirect data often influenced by 
subsequent interactions, but they tend to more heavily 
focus on users’ subjective preferences rather than 
objective descriptions of problems. Were user preferences 
consistent predictors of performance deficiencies, 
data quality would be unaffected, but frequently 
they are not (Frøkjær, Hertzum, & Hornbæk, 
2000; Nielsen & Levy, 1994). 

Preference feedback and descriptions of problems 
from memory may not directly represent encountered 
usability deficiencies, but they are still at 
least “from the horse’s mouth.” Another key effect 
of cost constraints is a reliance on filtered usability 
feedback from sales professionals or from custom- 
ers making the buying decisions (who may have 
varying levels of engagement with end users) rather 
than directly from the end users themselves. The 
quality of such filtered data remains largely 
uninvestigated. 

A similar filtering that has been investigated is 
data collection via analytic methods. Such methods 
often explicitly require usability specialists to take 
on the role of the end user. Hertzum and Jacobsen 
(2001) distinguish between two types of barriers to 
data quality in these cases: (a) anchoring, in which 
a system is evaluated with respect to users too 
similar to the evaluator to be representative of the 
user population, and (b) stereotyping, in which the 
system is evaluated with respect to a homogeneous 
catchall user not accounting for a wide-enough 
range of user characteristics. Not surprisingly, such 
biases and differences in technique lead to differ- 
ences in the results of analytic methods applied by 
different evaluators (Andre, Hartson, Belz, & 
McCreary, 2001; Cockton, Woolrych, Hall, & 
Hindmarch, 2003); more problems tend to be dis- 
covered by a team of evaluators specializing their 
focus (i.e., looking for only certain types of problems 
in the interface; Zhang, Basili, & 
Shneiderman, 1999). As a result, multiple evaluators 
are commonly recommended in employing analytic 
methods. 

Method Scope 

A shift toward preference data for particular col- 
lection methods is one example of limited method 
scope; the types of problems typically indicated by 
different collection methods can vary (John & 
Kieras, 1996), one of the primary reasons they 
traditionally supplement one another in user-centered 
design processes. Englefield (2003) distinguishes 
between the breadth of a usability method’s 
data collection capabilities (similar to method thoroughness, 
or the extent to which it is capable of 
detecting all usability deficiencies) and its sensitiv- 
ity to particular types of problems, claiming for 
example that empirical methods tend to be sensitive 
to sociotechnical design problems that expert inspections 
have difficulty identifying. 

An important consequence is that methods vary 
in their appropriateness at different stages in the 
development cycle (Lewis & Wharton, 1997; Rubin, 
1994) and for different types of prototypes or sys- 
tems; the fidelity of the prototype used can have 
subtle effects on the sensitivity of data collection 
(Virzi, Sokolov, & Karis, 1996). Coordinating the 
application of multiple collection methods is a matter 
of achieving optimal scope so as not to focus too 
heavily on particular types of problems at the ex- 
pense of others, for example, by devoting too many 
resources to analytic methods such as cognitive 
walk-throughs to address system learnability, but 
doing so at the expense of other usability dimensions. 

Stimulus and Simulation Effects 

Ecological validity is of primary concern for usability 
data collection given the goal of discovering defi- 
ciencies that occur for real users in their actual 
working environments. Remote collection allows 
end users to evaluate systems under more realistic 
social and technical constraints (Krauss, 2003), but 
in many cases, the user is nonetheless evaluating an 
incomplete prototype. Particularly in the application 
of empirical methods, usability specialists have long 
been aware of subtle and unintended effects of 
prototypes on the quality of the data they collect. 
Usability engineering work frequently involves the 
assessment of systems that range in fidelity from 
fully functional products to paper prototypes and 
sketches. Matching fidelity attributes to data collec- 
tion goals is often a difficult balancing act. Have too 
high fidelity, and an end user may focus on color 
schemes or branding logos that were never intended 
to represent the definite final product. Have too low 
fidelity, and normally attention-consuming aspects 
of the interface may fail to have their realistic 
impact. Have the product be too vertical (deep 
functionality for only a few features or tasks), and 
the user may lose focus when attempting to explore 
nonfunctioning areas of the prototype. Have it be too 
horizontal (shallow functionality across many fea- 
tures and tasks), and collected data may be of little 
use in driving design decisions. 

A more subtle difficulty occurs in producing 
mock data for prototypes and simulated usability 
tests (Kantner, Sova, & Rosenbaum, 2003). Both 
the realism and the credibility of the test itself can 
hinge on this activity, particularly if the tasks of 
interest to the evaluator are focused on information 
gathering and usage. The realism of such mock data 
can be difficult to achieve in part because prototype 
designers are typically not subject-matter experts in 
the system’s domain and are often still learning 
about the domain during development. 

Finally, evaluators often must accept the practi- 
cal limits of task simulation. Tasks that span days or 
even weeks, for example, require the application of 
less direct methods to gather data (Galdes & Halgren, 
2001). Systems for which tasks frequently involve 
safety risks similarly present simulation difficulties. 

Some of the advantages of having an expert evaluator 
present to interact with end users during empirical 
data collection have been previously mentioned, 
such as the ability to clarify user comments and 
reasoning. However, numerous aspects of evaluator 
intervention affect usability data, including the amount 
of such intervention (Held & Biers, 1992), the type 
of observation and type of evaluator prompts (such 
as leading questions and task guidance; Galdes & 
Halgren, 2001; Kjeldskov & Skov, 2003), and the 
presence of recording devices (Nielsen, 1993). Boren 
and Ramey (2000) investigated the usage of verbal 
protocols in usability testing, finding widespread 
inconsistencies in the method’s application. These 
effects are perhaps impossible to avoid altogether; 
as C. Karat (1994) puts it, collection methods act as 
“filters” on user-system interactions. 

The effects of evaluator differences have largely 
been investigated by noting a substantial lack of 
overlap between usability problem sets produced by 
multiple evaluators (Hertzum, Jacobsen, & Molich, 
2002; Jacobsen, Hertzum, & John, 1998; Molich, 
Ede, Kaasgaard, & Karyukin, 2004). These effects 
cannot be avoided when the testing work must be 
divided for practical reasons, for example, when a 
system has an international user base, and cultural 
and language barriers require the expertise of mul- 
tiple team members (Vasalou et al., 2004). The clear 
implication is that usability tests conducted by different 
evaluators lack reliability and consistency. However, 
because such consistency is not the only goal 
of usability data collection, these effects can turn out 
to be a blessing. The secondary effect of inconsis- 
tent think-aloud techniques is an increase in the 
scope and sensitivity of the method; similar to ana- 
lytic methods, optimal empirical data collection for 
usability purposes is likely a team effort. 
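
The lack of overlap between evaluators, and the gain in scope from pooling their results, can both be quantified. The sketch below uses an average pairwise Jaccard overlap, which is one plausible agreement measure chosen for illustration and is not claimed to be the measure used in the cited studies. The function names and problem identifiers are hypothetical.

```python
from itertools import combinations


def mean_pairwise_overlap(problem_sets):
    """Average Jaccard overlap across all pairs of evaluators' problem sets."""
    pairs = list(combinations(problem_sets, 2))
    if not pairs:
        raise ValueError("need at least two evaluators")
    scores = []
    for a, b in pairs:
        union = a | b
        # Two empty sets are treated as perfectly agreeing.
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def pooled_problems(problem_sets):
    """All distinct problems found by the team as a whole."""
    pooled = set()
    for problems in problem_sets:
        pooled |= problems
    return pooled


if __name__ == "__main__":
    evaluators = [
        {"p1", "p2", "p3"},
        {"p2", "p3", "p4"},
        {"p3", "p5"},
    ]
    print(mean_pairwise_overlap(evaluators))
    print(sorted(pooled_problems(evaluators)))
```

A low overlap score alongside a large pooled set corresponds to the situation described above: individual evaluators are inconsistent, yet the team as a whole covers more problems than any one member.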

Analyst Ability and Analytical Bias 

Placing usability specialists with varying backgrounds 
and perspectives in the picture affects not only the 
data resulting from evaluation methods, but the 
interpretation of that data and resulting work prod- 
ucts in the development cycle. Analyst skills come 
into play in two important areas: (a) the organization 
and prioritization of usability deficiencies and poten- 
tial solutions, and (b) the reporting of these conclu- 
sions to development. 

Determining problem severity is partially a judg- 
ment call. Jacobsen et al. (1998) point out that the 
criteria employed by different analysts in estimating 
severity can vary, and that such estimates may be 
biased when analysts judge problems that they them- 
selves observed. On the other hand, severity judg- 
ments by analysts who did not observe the problem 
are limited by a lack of information about the details 
and context of the deficiency. Here again, an indi- 
vidual analyst is probably nonoptimal. Similarly, ana- 
lyst expertise is critical in the identification of under- 
lying causes of usability problems and their transla- 
tion into design solutions (John & Marks, 1997). 
Such expertise is also important in the recognition of 
similar problems previously encountered and appli- 
cable solution patterns. Expertise is also important in 
the successful prioritization of problems, which is 
typically the result of considering estimated problem 
severity and representativeness (Gediga, Hamborg, 
& Düntsch, 1999) as well as likely implementation 
costs. 
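
One way to picture such prioritization is as a score that rises with severity and representativeness and falls with implementation cost. The sketch below is a hypothetical heuristic, not a formula from the cited work: the `Problem` fields, their scales, and the scoring rule are assumptions made for illustration, and in practice these judgments are made by analysts rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """A usability problem with analyst estimates (hypothetical fields)."""
    name: str
    severity: int              # assumed scale, e.g. 1 (minor) to 4 (critical)
    representativeness: float  # estimated share of users affected, 0 to 1
    cost_days: float           # estimated implementation cost in person-days

    def priority(self) -> float:
        """Benefit estimate divided by cost; higher means fix sooner."""
        if self.cost_days <= 0:
            raise ValueError("cost_days must be positive")
        return self.severity * self.representativeness / self.cost_days


def prioritize(problems):
    """Return problems ordered from highest to lowest priority score."""
    return sorted(problems, key=lambda p: p.priority(), reverse=True)


if __name__ == "__main__":
    backlog = [
        Problem("confusing checkout label", 4, 0.5, 4),
        Problem("missing undo", 2, 0.9, 1),
        Problem("slow report export", 3, 0.2, 2),
    ]
    for problem in prioritize(backlog):
        print(problem.name, round(problem.priority(), 2))
```

A severe problem can rank below a milder one when it is costly to fix or affects few users, which is why the inputs to any such score deserve more scrutiny than the arithmetic.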

Once problems and potential solutions have been 
identified, analysts are responsible for effectively 
reporting a plan of action for improving system 
usability. This activity is of course set against a 
background of a project commitment and incentives 
(or lack thereof) for achieving usability goals, and a 
previous relationship between usability and develop- 
ment teams (Bias & Mayhew, 1994; Mirel, 2000), 
but additionally depends on analyst ability and standardized 
reporting (Bevan, 1998). Andre et al. (2001) 
argue, however, that even armed with standard 
reports, descriptions of design solutions are often 
vague or incomplete, and there is inevitable loss of 
information as developers interpret these documents. 
Analysts must pick the right set of problems to 
present as too large a set may hinder persuasiveness 
(Dumas & Redish, 1993). They must also effec- 
tively leverage face-to-face meetings with develop- 
ment (Galdes & Halgren, 2001). 

Development Conflicts 

Solutions to usability deficiencies can be at odds with 
one another, forcing a reliance on prioritization, but 
even the highest priority and most persuasive data 
can be thwarted by concerns orthogonal to usability 
improvements. Some of these concerns are simply 
outside development control, such as imposed cor- 
porate standards (Hertzum, 1999). Others require 
trade-off decisions, such as in considering legacy 
concerns, whether potential solutions to usability 
problems may conflict with other aspects of soft- 
ware quality, and the architectural changes needed 
to fix high-severity problems (Folmer & Bosch, 
2004). The timing of data collection is again critical 
since addressing usability concerns late in the cycle 
is likely to increase development costs (Folmer & 
Bosch, 2003). 

Process Bias 

Conceptually, there are two types of potential process 
biases that can hinder usability data impact: (a) 
if a type of data is cost effective and tends to achieve 
high impact when reacted to, but nonetheless tends 
not to be persuasive in the eyes of developers and 
other stakeholders (“untapped potential”), and (b) if 
a type of data tends not to have impact but has high 
persuasiveness (“false prophet”). For example, sup- 
pose a development team highly respects one type of 
data (say, empirical usability test data) and tends to 
focus on implementing changes suggested by it at 
the expense of other types of data, such as heuristic 
reviews. If it turns out that heuristic reviews pro- 
duce cost-effective, high-impact data, or if the us- 
ability testing data tends to be of low quality, impact 
on system usability suffers from a process bias. The 
first type (high impact and cost effectiveness, but 
low persuasiveness) only indicates a potential bias 
since it is at least possible that downstream stakeholders 
will tune responses to data from a particular 
method based on an effective understanding of 
when it will likely have impact. 
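
The two bias types can be restated as a small decision rule. The sketch below encodes only the definitions given in this section; the function name and the reduction of impact, cost effectiveness, and persuasiveness to yes/no flags are simplifying assumptions.

```python
def classify_process_bias(high_impact: bool, cost_effective: bool,
                          persuasive: bool) -> str:
    """Label a type of usability data using the two bias categories above."""
    if high_impact and cost_effective and not persuasive:
        # Valuable data that developers and stakeholders tend to discount.
        return "untapped potential"
    if persuasive and not high_impact:
        # Convincing data that tends not to improve the system.
        return "false prophet"
    return "no process bias indicated"


if __name__ == "__main__":
    # Example from the text: heuristic reviews that are effective but
    # less respected than empirical test data.
    print(classify_process_bias(True, True, False))
    # Empirical test data that persuades but turns out to be low quality.
    print(classify_process_bias(False, True, True))
```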



FUTURE TRENDS 

The range of factors influencing usability data’s 
likelihood of impacting system design, in conjunction 
with concerns orthogonal to system usability, leaves 
many challenges open for usability research. This 
section identifies two general emerging research 
areas important to usability theory and practice. 

Balancing Data Collection Needs 
and Effects 

The attractiveness of data with high communicative 
effectiveness leads many evaluators to value users’ 
design ideas and rationalizations about their behav- 
iors, but the validity of such data is a serious concern. 
In practice, this problem can be exacerbated by the 
persuasiveness and appeal of design sketches straight 
from end users. The appeal is certainly understand- 
able. When limited resources are spent visiting a 
small set of customers or end users, there is an 
organizational stake in ensuring that data at a pre- 
mium is put to good use; from a customer relation- 
ship perspective, there is, perhaps more importantly, 
value in (directly) demonstrating that customer feed- 
back and ideas are listened to. A valuable direction 
for usability science may be to map out the problems 
and potentials of balancing usability data quality 
concerns with needs that are created by the simple 
act of engaging customers and end users during 
development. 

Balancing the Present and Future 

New companies often have small sets of early 
adopters who are the primary sources for evaluating 
systems in actual use. Responding to their needs is 
critical, and not surprisingly, they have significant 
impact on the early design of the product. However, 
effective response to high-quality data from these 
customers now may create an unavoidable hole 
later; who representative users are in the initial 
stages of a system can change dramatically in a 
short time. A similar problem arises with any rela- 
tively new product, for which power users simply do 
not yet exist. How does one most effectively design 
for nonexistent experts? How are usability concerns 
most effectively balanced to meet current needs but 
not create usability legacy problems? 



CONCLUSION 

User-centered design processes are subject to a 
number of basic constraints limiting the quality of 
usability data collected within these processes. The 
inherent limitations of widely employed methods and 
tools hinder the successful collection of high-quality 
usability data. End users, analysts, developers, and 
other stakeholders frequently introduce biases and 
other unintended effects into data collection, analysis,
and development processes. Such effects often
result in less-than-optimal responses to a
system’s deficiencies, but can in some cases increase
the scope and sensitivity of data collection.
Barriers to usability data quality can additionally 
arise from concerns orthogonal to system usability, 
or from factors not easily predicted during system 
development.

KEY TERMS

Anchoring: An evaluator bias in analytic meth- 
ods in which a system is evaluated with respect to 
users too similar to the evaluator to be representa- 
tive of the user population. 

Method Breadth: Extent to which a usability- 
evaluation method is capable of detecting all of a 
system’s usability deficiencies. 

Method Sensitivity: Extent to which a usabil- 
ity-evaluation method is capable of detecting a par- 
ticular type of usability deficiency. 



Stereotyping: An evaluator bias in analytic 
methods in which a system is evaluated with respect 
to a homogenous catchall user not accounting for a 
wide-enough range of user characteristics. 

Usability Barrier: Technical, cognitive, social, 
or organizational constraint or bias that decreases 
usability-data quality, consequently hindering the 
optimal detection of, and response to, a system’s 
usability deficiencies. 

Usability Data: Any information used to mea- 
sure or identify factors affecting the usability of a 
system being evaluated. 

Usability Data Quality: Extent to which usabil- 
ity data efficiently and effectively predict system 
usability in actual usage, can be efficiently and 
effectively analyzed, and can be efficiently and 
effectively reacted to. 

Usability Evaluation Method (UEM): Method 
or technique that can assign values to usability 
dimensions and/or indicate usability deficiencies in a 
system.

INTRODUCTION

A substantial portion of usability work involves the 
coordinated collection of data by a team of special- 
ists with varied backgrounds, employing multiple 
collection methods, and observing users with a wide 
range of skills, work contexts, goals, and responsi- 
bilities. The desired result is an improved system 
design, and the means to that end are the successful 
detection of, and reaction to, real deficiencies in 
system usability that severely impact the quality of 
experience for a range of users. 

In the context of user-centered design processes,
valid and reliable data from a representative user
sample are simply not enough. High-quality usability
data are not just representative of reality. They are useful.
They are persuasive in the eyes of the right stakeholders.
They result in verifiable improvements to the system
in which they identify a deficiency.
The data must be efficiently and effectively translated
into development action items with appropriate
priority levels, and they must result in effective work
products downstream, leading to cost-effective design
changes.

The remainder of this article (a) briefly reviews 
basic usability data collection concepts, (b) exam- 
ines the dimensions that make up high-quality usabil- 
ity data, and (c) suggests future trends in usability 
data quality research. 

BACKGROUND 

Usability data are critical to the successful design of 
systems intended for human use, and are defined by 
Hilbert and Redmiles (2000) as any information used 
to measure or identify factors affecting the usability 
of a system being evaluated. Such data are collected 
via usability evaluation methods (UEMs), meth- 
ods or techniques that can assign values to usability 
dimensions (J. Karat, 1997) and/or indicate usability
deficiencies in a system (Hartson, Andre, & Williges,
2003). Usability dimensions are commonly taken to
include at least user efficiency, effectiveness, and
subjective satisfaction with a system in performing a
specified task in a specified context (ISO 9241-11,
1998), and frequently also include system memora- 
bility and learnability (Nielsen, 1993a). 

Usability data are collected using either analytic 
methods, in which the system is evaluated based on 
its interface design attributes (typically by a usability 
expert), or empirical methods, in which the system 
is evaluated based on observed performance in 
actual use (Hix & Hartson, 1993). In formative 
evaluation, data are collected during the develop- 
ment of a system in order to guide iterative design. 
In summative evaluation, data are collected to 
evaluate a completed system in use (Scriven, 1967). 
Usability data have been classified in numerous 
other models and frameworks frequently focusing 
on the procedure for producing the data (including 
the resources expended and the level of the formal- 
ity of the method), the (relative) physical location of 
the people and artifacts involved, the nature and 
fidelity of the artifact being evaluated, and the goal 
of the collection process. 

DIMENSIONS OF USABILITY DATA 
QUALITY 

Usability-data quality refers to the extent to which 
usability data efficiently and effectively (a) predict 
system usability in actual usage (validity, reliability, 
representativeness, and completeness), (b) can be 
analyzed (communicative effectiveness and effi- 
ciency, and analyst estimates of severity), and (c) 
can be reacted to (downstream utility, impact, and 
cost effectiveness). This section discusses the di- 
mensions of usability data quality and their assess- 
ment. 
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Validity 

High-quality usability data are predictive of a real 
deficiency in one or more usability attributes for a 
given system. End-user behavior and comments 
may be perfectly unbiased or unaffected by the 
collection process, yet still lack validity from the 
perspective of usability science. Strict performance 
measures (such as time on task) may be viewed as 
lacking validity primarily because they often fail to, 
on their own, demonstrate an underlying problem 
(Gediga, Hamborg, & Duntsch, 2002). Qualitative 
data more often do point directly to a deficiency, but 
if a user comments on a system feature that will 
never be used, for example, the comment may truly 
reflect the user’s attitudes but nonetheless lack 
validity. 

Verifying usability data validity lies in comparing 
the data’s predicted problems to the actual system 
performance in use (John & Marks, 1997; Nielsen &
Phillips, 1993). In practice, assessing validity is 
nontrivial for three fundamental reasons. First, there 
is not widespread agreement on how to operationalize 
ultimate usability criteria into actual criteria (Gray & 
Salzman, 1998; Hartson et al., 2003); that is, agree- 
ing on standard measures (and measurement proce- 
dures) for the underlying dimensions of usability 
itself is a long-standing difficulty. Second, observing 
the system in use and recording deficiencies is itself 
a usability data collection process, and thus at best 
the actual criterion is subject to possible validity 
concerns of its own. While these first two problems 
are by no means unique to usability research, they 
illustrate the difficulty in assessing usability data 
quality without a widely agreed upon method for 
identifying what will be accepted as the system’s 
real deficiencies. Finally, individual pieces of usabil- 
ity data are often difficult to translate into underlying 
problems, and this step is necessary if validity is to be 
assessed. 

To make the problem slightly more tractable, 
researchers have by and large elected to evaluate 
validity using usability testing as a benchmark for 
comparison, as it is assumed to most closely reflect 
system performance in use (Cuomo & Bowen, 1994;
Desurvire, 1994; Jacobsen, Hertzum, & John, 1998).
There are of course potential problems with this
approach as usability testing has at least ecological
validity concerns (Thomas & Kellogg, 1989). Indeed,
this problem generally makes the literature
comparing UEM effectiveness difficult to interpret 
(Gediga et al., 2002; Gray & Salzman, 1998). Ideally, 
a standard method is applied to assessing live system 
performance, producing a usability problem set. 
Validity is then assessed by comparing the problem 
set produced by a UEM to the standard set (Sears, 
1997). 

Reliability and Representativeness 

High-quality usability data not only indicate real 
problems, but indicate problems that will be repeat- 
edly encountered by individual users (reliable) and 
by a wide range of users (representative). As with 
many disciplines, data collected for usability pur- 
poses vary in the extent to which the repeated 
exposure to a problem is a good predictor of validity. 
While subjective satisfaction ratings that vary one 
day to the next put validity in question, encountering 
only occasional difficulty in executing a system 
action or completing a task, for example, does not 
since user errors indicating real interface problems 
commonly vary in frequency of occurrence. Unlike 
research in many other disciplines, representative- 
ness across participants is not simply a question to be 
investigated, but a contributor to problem impor- 
tance and therefore a dimension of data quality. 

Measuring reliability and representativeness is a 
matter of identifying the recurrence of specific 
problems (Jeffries, Miller, Wharton, & Uyeda, 1991).
Such measurement is nontrivial because problem 
reports may differ in verbiage but still indicate the 
same underlying problem, or conversely may be 
similar in their qualitative descriptions but indicate 
different deficiencies (Andre, Hartson, Belz, & 
McCreary, 2001; Hartson et al., 2003). 
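
As a rough illustration of measuring recurrence, the sketch below tallies
problem reports that analysts have already mapped to common problem
identifiers (the nontrivial step noted above). The record format, the sample
identifiers, and the two ratios are assumptions made for illustration; they
are not measures defined by the cited authors.

```python
from collections import defaultdict


def recurrence(reports, participants):
    """Summarize how often each problem recurs.

    reports: iterable of (participant_id, problem_id) pairs, one per encounter.
    participants: collection of every participant observed (must be non-empty).
    Returns {problem_id: (representativeness, reliability)}, where
    representativeness is the share of participants who hit the problem and
    reliability is the mean number of encounters per affected participant.
    Both definitions are illustrative assumptions.
    """
    encounters = defaultdict(lambda: defaultdict(int))
    for participant, problem in reports:
        encounters[problem][participant] += 1

    total = len(set(participants))
    summary = {}
    for problem, per_user in encounters.items():
        affected = len(per_user)
        summary[problem] = (affected / total, sum(per_user.values()) / affected)
    return summary


# Hypothetical data: four participants, two distinct problems.
reports = [("u1", "P1"), ("u1", "P1"), ("u2", "P1"), ("u3", "P2")]
summary = recurrence(reports, ["u1", "u2", "u3", "u4"])
```

In this hypothetical data set, P1 is both more representative (two of four
participants) and more recurrent per user than P2.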

Completeness 

High-quality usability data represent usability prob- 
lems in their entirety. One of the critical difficulties 
in analyzing pure behavioral data is their lack of 
contextual information about the user’s current task,
attention level, and cognitive processes while a
problem takes place (Hilbert & Redmiles, 1999);
another problem is their flood of extraneous data that
are not useful in evaluating the deficiency (Hartson
& Castillo, 1998). Ideal usability data predict a
system deficiency without requiring analysts to fill in
the blanks and without noise. Like validity, complete- 
ness cannot be properly assessed without standard 
problem descriptions for comparison. 

Researchers have instead shown interest in as- 
sessing analogous concepts at the UEM level, namely, 
the thoroughness of a method (the extent to which it 
uncovers all known usability problems, often with 
usability testing as the benchmark; Sears, 1997) and 
false positives, noting predicted problems that lack 
validity (Gediga et al., 2002). 
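
The comparison of a UEM's problem set against a standard set can be sketched
as simple set arithmetic. This is a minimal illustration that assumes problems
have already been normalized to shared identifiers; the function name and the
sample identifiers are hypothetical, and the two ratios follow one common
reading of Sears (1997) rather than a definitive formulation.

```python
def compare_to_standard(predicted, standard):
    """Compare problems predicted by a UEM with a benchmark problem set.

    thoroughness: share of benchmark problems the method uncovered.
    validity: share of the method's predictions found in the benchmark.
    false_positives: predictions absent from the benchmark.
    """
    predicted, standard = set(predicted), set(standard)
    hits = predicted & standard
    return {
        "thoroughness": len(hits) / len(standard) if standard else 0.0,
        "validity": len(hits) / len(predicted) if predicted else 0.0,
        "false_positives": predicted - standard,
    }


# Hypothetical problem identifiers.
result = compare_to_standard({"P1", "P2", "P5"}, {"P1", "P2", "P3", "P4"})
```

Here the method finds two of the four benchmark problems and reports one
problem (P5) that the benchmark does not contain.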



Communicative Effectiveness
and Efficiency

High-quality usability data can be easily translated by
usability analysts into problem descriptions that faithfully
represent the underlying deficiencies indicated
by the data. The practical importance of completeness
(and lack of noise) is clear to analysts who must
decipher large amounts of data within tight development
cycles. Usability data often suggest multiple
possible design responses and vary in the amount of
analysis time required to produce problem descriptions
(Preece et al., 1994), the amount of detail and
surrounding context made available by the collection
method or tool (Hartson & Castillo, 1998), and the
extent to which they refer to causes or effects of a
deficiency (Boren & Ramey, 2000; van Welie, van
der Veer, & Eliens, 1999). Each of these variables
contributes to a usability analyst’s ability to quickly
and accurately translate the data into problem descriptions
and ultimately potential solutions. Empirical
usability-data collection is largely an attempt to
invoke feedback more specific than “looks good to
me,” and this can be particularly difficult with remote
methods in which the opportunity for follow-up may
not exist.

Severity

High-quality usability data indicate deficiencies that
are not simply annoyances, but seriously impact the
quality of users’ experiences with the system and
ability to carry out their work. In many cases, data
collection processes are expected to ignore the noisy,
less serious (even if real) problems to maximize cost
effectiveness. Severity metrics are in some sense
predeployment predictions of the data’s likely effects
on usability as well as attempts to motivate the
prioritization of development resources in implementing
design changes. Following Nielsen (1994),
such metrics typically incorporate reliability and
representativeness with the predicted impact of the
problem (i.e., whether a work-around exists and
can be easily discovered, or if the problem will be a
“showstopper” and prevent task completion or further
use of the system). Thus, they often combine
objective measures with an analyst’s subjective
assessment.



Downstream Utility 

High-quality usability data persuade developers
and other stakeholders in the product development 
cycle to implement design changes. Downstream 
utility (or persuasiveness) is conceptualized in terms 
of the likelihood of usability data contributing to a 
change in a system’s interface (John & Marks, 
1997; Sawyer, Flanders, & Wixon, 1996). Because 
the tracking of usability data in real development 
cycles and the assessment of that data’s influence 
on design activities are difficult tasks, little research 
has investigated persuasiveness at the granularity 
of individual user comments or problem reports 
(Ebling & John, 2000). However, the focus on 
UEMs for this particular dimension of data quality 
is also reflective of two things. First, it recognizes 
that within user-centered design processes, data 
come in packages that often center around indi- 
vidual usability tests or heuristic reviews; that is, 
work products often simply report the results of 
individual tests that apply a single method. Second, 
it recognizes that stakeholders downstream have 
different levels of confidence in usability work 
based on the method. Severe usability problems 
sandwiched between low-quality data or appearing 
in reports for methods that have not gained stake- 
holder respect may be less persuasive by associa- 
tion. 

Impact 

High-quality usability data result in design changes 
that improve system usability. Data that are of high
quality along the dimensions already discussed,
prior to deployment, are typically assumed to ensure
impact. While this assumption is imperfect
(what made a problem severe or who a system’s
representative users were during the development 
cycle, for example, may change by the time the 
system is in live use), it is in practice an effective 
way of ensuring usability improvements. John and 
Marks (1997) first formalized the “design-change 
effectiveness” of usability data by tracking the 
data’s influence on design changes and subsequently
observing the effects of these implemented changes 
in a development cycle, using a live system usability 
test as the benchmark for comparison. Their work 
remains a rare attempt to observe data impact. 

Usability-data impact is of course not restricted 
to individual process cycles and products, and is not 
always geared toward correcting specific problems. 
Performance measures, for example, are often ex- 
plicitly meant to determine general pain points in an 
interface to guide subsequent data collection and 
generally drive usability resource allocation (Dillon 
& Morris, 1999; Gray & Salzman, 1998). Such data 
can also guide selection amongst design alternatives 
(Nielsen & Phillips, 1993). Usability data can result 
in long-term guidelines, standards, and design pat- 
terns applied to subsequent products (Henninger, 
2001), validate system improvements leading to 
increased usability funding and visibility, increase 
organizational acceptance of usability processes (C. 
Karat, 1994), indicate needed change in these pro- 
cesses, and open the eyes of developers and other 
stakeholders (particularly empirical data such as 
video from usability tests) to end-user needs and 
behaviors (Englefield, 2003). Finally, as user-cen- 
tered design processes are iterative and involve 
attempts to improve prototypes as much as possible 
in each iteration, quantitative data can be instrumen- 
tal in indicating when to stop iterating to maximize 
cost effectiveness. 

Cost (Effectiveness) 

High-quality usability data achieve impact at rela- 
tively low cost. Costs include the resources neces- 
sary to collect, analyze, and react to the data by 
implementing design changes. In more constrained 
experiments, time is often used as a simple proxy for 
cost, comparing the quality of the collected data to 
the time taken in collection (Englefield, 2003). 

To be fair, while comparing usability improvements
to cost is useful, it is in some sense meaningless
outside the perspective of usability. For this
reason, researchers have attempted to analyze the 
return on investment (ROI) of usability work, look- 
ing at broader impacts of usability data. Usability 
engineering methods have been argued to reduce 
development time and cost, reduce call center and 
support costs due to decreased usability deficien- 
cies, reduce system training costs (Nielsen, 1993b), 
increase the customer base, retain customers due to 
satisfaction with the system, and ultimately increase 
product sales (Bias & Mayhew, 1994). 

These effects appropriately do not refer directly 
to the usability of the system; the target of ROI 
analysis is the entity incurring the cost. Benefits to 
end users are relevant only insofar as they result in 
benefits to those investing in usability-data collec- 
tion. As a result, the connection between system 
usability and ROI depends in part on the type of 
system being evaluated. It is commonly noted that in 
e-commerce, the link between usability and buying
behavior is relatively straightforward since a 
showstopper in usability is necessarily a showstopper 
in completing a transaction. Similarly, the usability of 
internal systems such as intranets leads to increased 
employee productivity and satisfaction for the company
footing the usability bill. But in contexts where
buying decisions are not made by end users and 
instead by customers who have varying levels of 
engagement with end users, the connection, and the 
ROI of usability-data collection, is less clear-cut. 

Even in instances in which the end user makes 
the buying decision, the typical context of those 
decisions likely impacts the relationship between 
usability and ROI. Lesk (1998) gives the example of 
end users making buying decisions at trade shows 
(or computer stores), in which case actual system 
usability may take a backseat to the perceived 
usability achieved by a quick interaction with a demo 
or display unit. Generally, the connection between 
usability and system acceptance is not entirely clear. 
Dillon and Morris (1999) review system acceptance 
models indicating perceived usefulness to be a more 
powerful predictor, but as one might intuitively pre- 
dict, usability may over time influence continued use; 
that is, actual usability begins to impact the user 
perceptions that under many circumstances powerfully
influence system acceptance.

FUTURE TRENDS

The range of factors influencing usability data’s 
likelihood of influencing system design, in conjunc- 
tion with concerns orthogonal to system usability, 
leave many challenges open for usability research. 
This section identifies a few important research 
areas for usability-data-quality assessment. 

Customization Benefits and 
Unforeseen Pitfalls 

Systems that are customized (whether at the orga- 
nizational or individual level) frequently introduce 
usability benefits to end users. Organizations may 
configure systems to most efficiently support their 
business processes, and end users may find in- 
creased satisfaction with a system personalized to fit 
their work styles and tastes. However, customization 
can clearly open a can of worms for usability pro- 
cesses since ill-guided customization introduces us- 
ability deficiencies. In some sense, highly 
customizable systems have the potential to take 
system usability out of the hands of user-centered 
design. Assessing these systems requires account- 
ing for the extent to which they are likely to prevent 
the introduction of usability deficiencies through 
customization. The more flexible the customization, 
the less trivial this assessment becomes as it re- 
quires performing formative evaluations of a system 
to which customers and end users will employ 
frequent and unforeseen adjustments. 

From Methods to Individual Data

The focus on usability evaluation methods as the unit
of analysis for several data-quality dimensions leaves
open a number of interesting questions. While a
piece of data’s collection method is likely a critical
attribute for data quality (as mentioned for downstream
utility in particular), more fine-grained analysis
investigating how user comments and problem
reports make their way through development cycles
and influence system design is an open area for
future usability research.

Usability Process Assessments

In contrast to investigating the influence of individual
pieces of usability data, arguably the most critical
area for usability data quality research lies in identifying
optimal usability processes. While many studies
have attempted to address the relative merits of
individual collection methods and the features of
these methods (such as the optimal number of
participants for empirical testing), little empirical
work has attempted to piece together the most
effective aggregations of these methods for collecting
usability data of maximal scope and the optimal
coordination of these methods. While a good deal of
conventional wisdom exists regarding such coordination,
usability research can offer to validate and
refine this wisdom.

CONCLUSION

The quality of usability data depends on how well
they predict real (and severe) system deficiencies
experienced by a wide range of users, how easily
and successfully they can be analyzed by usability
experts, and how easily they can be reacted to in
producing an improved system design. Usability
research attempts to accurately assess the quality of
usability data by observing their effects throughout
product development cycles. Difficulties in assessing
usability data quality arise from numerous sources,
including often unforeseeable mismatches between
the assessment and actual system-usage environments.

KEY TERMS

Analytic Method: Method in which a system is 
evaluated based on its interface design attributes 
(typically by a usability expert). 

Empirical Method: Method in which a system 
is evaluated based on observed performance in 
actual use. 

Formative Evaluation: The collection of us- 
ability data during the development of a system in 
order to guide iterative design. 

Summative Evaluation: The collection of us- 
ability data to evaluate a completed system in use. 

Usability Data: Any information used to mea- 
sure or identify factors affecting the usability of a 
system being evaluated. 

Usability Data Quality: Extent to which usabil- 
ity data efficiently and effectively predict system 
usability in actual usage, can be efficiently and 
effectively analyzed, and can be efficiently and 
effectively reacted to. 

Usability Evaluation Method (UEM): Method 
or technique that can assign values to usability 
dimensions and/or indicate usability deficiencies in a 
system.

INTRODUCTION

The term affordance was coined by Gibson (1977, 
1979) to define properties of objects that allow an 
actor to act upon them. Norman (1988) expanded on 
this concept and presented the concepts of real and 
perceptual affordances in his book The Psychology 
of Everyday Things. Norman was essentially the 
first to present the concept of affordance to the field 
of human-computer interaction (HCI). 

Since then, affordance as a term has been used 
by many designers and researchers. But as Norman 
(1999) explained, many of the uses of the term are 
vague or unclear, which prompted the writing of his 
1999 article in the Interactions periodical. In fact, 
there have been many publications that try to elucidate
the term (see Hartson, 2003; McGrenere & Ho,
2000).

This article will try to provide a brief overview of 
the term and its many subclasses. It will try to give 
the reader a clear idea about what affordance is and 
how the concept can be used to allow designers and 
researchers to create better user interfaces and 
better interaction devices. The article, however,
does not try to clear up any ambiguities in the usage 
of the term in the literature or present a new way of 
viewing affordance. Rather, it tries to provide a 
short overview of the literature around affordance 
and guide the reader to a correct understanding of 
how to use affordance in HCI. 



BACKGROUND 

This section presents the evolution of the concept of 
affordance. It presents the creation of the term by 
Gibson (1977, 1979), and the way that affordance 
was incorporated into HCI. 



Gibson’s Affordance 

As mentioned in the introduction, Gibson (1977, 
1979) was the one who coined the term affordance 
to refer to the actionable properties between the 
world and an actor (whatever that actor may be; 
Gibson as cited in Norman, 1999). Gibson did not 
create the term to refer to any property that may be 
observable by the actor. Rather, he referred to all 
the properties that allow the actor to manipulate the 
world, be they perceivable or not. Thus, in Gibson’s
view, an affordance is just a characteristic of the
environment that happens to allow an actor to act
upon the environment. In this view, saying that a
designer has added an affordance to a device or an
interface does not immediately mean that the device
or the interface has become more usable, or that the
user would be able to sense the affordance in any 
way that would help him or her understand the usage 
of that device or interface. In fact, in Gibson’s 
definition, an affordance is not there to be perceived. 
The affordance just exists and it is up to the actor to 
discover the functionality that is offered by the 
affordance. It is just a feature of the environment. 

Norman’s Affordance 

Norman (1988) took the term affordance from Gibson
(1977, 1979), and in his book The Psychology of 
Everyday Things, he elaborated upon it, creating 
something quite different from the original defini- 
tion. Norman did not change the original term. 
Rather, he introduced the concept of perceived 
affordance, which defines the clues that a device or 
user interface gives to the user as to the functionality 
of an object. He also distinguished it from Gibson’s 
affordance, which he named real affordance. We 
will mention probably the most used example of
affordance in HCI to clarify the difference between
a real affordance and a perceived affordance. Consider
a door that opens when pushed having a flat
plate that takes the place of the door handle (Figure
1b). The design of the door handle gives out the clue
that the door is not supposed to be pulled since there
is no handle that the actor can grab in order to pull the
door. Conversely, a door handle that can be grabbed
(Figure 1a) gives out the clue that the door opens
when pulled. However, as Norman (1988) points 
out, this convention is not always followed, resulting 
in people thinking that they cannot figure out how to 
open a door whereas the problem lies in bad design 
and bad use of a perceived affordance. The differ- 
ence between the real affordance, or the affordance 
as defined by Gibson, and the perceived affordance 
in Norman’s definition is that the door affords to be 
opened in some way but the perceived affordance 
that the flat panel gives out is that the door can be 
opened by pushing on the panel. 

Norman (1988) concludes that well-designed 
artifacts should have perceived affordances that 
give out the correct clues as to the artifacts’ usage 
and functionality. 

Gaver’s Affordance 

Gaver (1991) also offered a definition of affordance,
but he broke affordance down into four different
categories: perceptible affordance, false affordance,
correct rejection, and hidden affordance (Figure 2).
Perceptible affordance is the affordance for
which there is perceptual information for the actor to 
perceive. This type of affordance would fall under 
Norman’s (1988) perceived-affordance definition. 
Conversely, if there is information that suggests that 
an affordance is there when there is none, then that 
is a false affordance. A hidden affordance is an 
affordance for which no perceptual information 
exists. Finally, a correct rejection is the case when 
there is no perceptual information and no affordance. 
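
The four categories amount to a two-by-two classification over whether an
affordance exists and whether perceptual information about it is available.
The sketch below is only an illustration of that reading; the function and
parameter names are hypothetical and do not come from Gaver (1991).

```python
def gaver_category(affordance_exists: bool, information_available: bool) -> str:
    """Map the two yes/no questions onto Gaver's four categories."""
    if affordance_exists and information_available:
        return "perceptible affordance"
    if affordance_exists:
        # The action is possible, but nothing signals it to the actor.
        return "hidden affordance"
    if information_available:
        # Something suggests an action that is not actually possible.
        return "false affordance"
    return "correct rejection"


# A push plate on a door that really opens when pushed:
plate = gaver_category(affordance_exists=True, information_available=True)
```

A pull handle mounted on a push-only door would, under the same reading, be
classified as a false affordance for pulling.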

In Gaver’s (1991) terms, affordance is the existence
of a special configuration of properties so that:

physical attributes of the thing to be acted upon
are compatible with those of an actor, that 
information about those attributes is available in 
a form compatible with a perceptual system, and 
(implicitly) that these attributes and the action 
they make possible are relevant to a culture and 
a perceiver. (Gaver, 1991, p. 81) 



In fact, Gaver (1991) united the two concepts of 
real and perceived affordance, and named the sys- 
tem of the property of an object and the ability of that 
property to be perceived as affordance. 

Figure 1. Two door handles, one (a) very 
confusing as to its usage, and one (b) which gives 
clues as to its usage 

Figure 2. Separating types of affordance from 
information available about them (Gaver, 1991) 

Hartson’s Affordance 

Hartson (2003) used the concept of affordance to 
create the User Action Framework (UAF). He 
based the concept on Norman’s (1988) definition, 
but also redefined it to make the distinction 
between the types of affordance that can be 
encountered clearer. He refers to four different types of 
affordance: physical, cognitive, sensory, and func- 
tional. He defines physical affordance as a feature of 
the artifact that allows the actor to do something with 
it. A cognitive affordance is the information that 
allows the actor to realize that the physical affordance 
is there. A sensory affordance is the information that 
the actor gets before it gets processed at a cognitive 
level, and a functional affordance is the usefulness 
that the physical affordance gives to the actor. The 
metal plate on the door from the previous example 
can be used to elucidate the differences between 
each affordance type that Hartson proposes. The 
physical affordance of the plate is the feature that 
allows the placement of the hand of the user on the 
door so that the user can open the door. The cognitive 
affordance is the combination of information from 
the user’s knowledge and the appearance of the plate 
that allows the user to realize whether the door is 
opened by pulling or pushing. If one assumes that the 
metal plate has Push engraved on it, then the clarity 
of the lettering and the size and shape of the letters 
that allow the user to clearly make them out is the 
sensory affordance. Finally, the functional affordance 
is the placement of the plate at the position on the 
door that allows the easiest opening of the door 
when the user pushes on the metal plate. 

Hartson (2003) goes on to propose the UAF, 
which is a framework for designing systems and 
artifacts. The framework is based on the four types 
of affordance that he proposes. For more information 
on the UAF, the reader is referred to Andre, Belz, 
McCreary, and Hartson (2000), Andre, Hartson, 
Belz, and McCreary (2001), and Hartson, Andre, 
Williges, and Van Rens (1999). 

DISCUSSION 

Affordance is perhaps one of the most exciting 
concepts in HCI. Its introduction by Norman (1988) 
created a big stir and, because the term was then 
used inconsistently, a great deal of discussion. 
Many, along with Norman, claim that affordance as 
presented in the POET book was not understood 
correctly by the HCI community, which triggered 
much of that discussion and a body of literature 
trying to elucidate the meaning of the term. 

Authors such as Gaver (1991), Hartson (2003), and 
McGrenere and Ho (2000) have also provided 
frameworks or theories based on affordance that 
can help designers create better user interfaces 
and interaction devices. For example, a designer 
could create a user interface and make sure 
that the buttons are clearly labeled so that the user 
can easily read the labels (sensory affordance) to 
understand what each button does (cognitive 
affordance) in order to use it correctly (functional 
affordance; Hartson, 2003). This example shows 
how one could think about the different types of 
affordance in Hartson’s definition to create a better 
interface. 

The concept of affordance is indeed useful in 
HCI because, at the very least, it forces the de- 
signer to think about the information that he or she 
is giving to the user by the very design of the user 
interface or interaction device. 

Another short example that uses Norman’s 
(1988) definition of affordance may elucidate the 
usage of the concept in interaction-device design. 
Consider the design of a keyboard for a palm device 
(much like the one on the Handspring Treo devices). 
The keys on this keyboard are restricted by 
the size of the device, which means the physical 
affordance of a user pressing the buttons is hindered. 
However, a designer who thinks about this 
at the design stage will already have taken this 
affordance into account and may allow for slightly 
bigger buttons by giving the device slightly 
bigger dimensions at the bottom, where the keyboard 
lies. Other examples in interaction-device 
design that take account of affordance can be found 
in the tangible-user-interfaces literature (Ishii & 
Ullmer, 1997). 

When thinking about affordance, it sometimes 
does not matter which definition one uses, as long 
as the definition helps make the design more 
usable. In both cases above, one can see that by 
taking the concept of affordance into account, the 
design of a device becomes a little more usable and 
maybe even a little more foolproof. 

FUTURE TRENDS 

Affordance is an always-evolving concept. Re- 
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notations for describing affordance (Steedman, 2002). 
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2001) and Sound Canvas (Cheung, 2002). There are 
many more avenues for research in the concept of 
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CONCLUSION 

This article presented a brief overview of the usage 
of the concept of affordance in HCI. A brief history 
of the creation of the term was provided, along with 
some of the major contributions to the evolution and 
clarification of the meaning of the concept. It briefly 
discussed how the concept could be used by design- 
ers in order to create more usable user interfaces 
and interaction devices. It also mentioned some of 
the frameworks that have been created to facilitate 
the design process based on this concept. 

By reviewing the most prevalent definitions, a 
foundation was provided upon which one could build 
a solid understanding of what affordance is. When 
used correctly, affordance can provide the user of a 
user interface or interaction device clues as to how 
to use it correctly even if the user has little training 
with the interface or device. 

Finally, some examples were provided to demon- 
strate how a designer could incorporate thinking 
about affordance at the design stage of an interface 
or interaction device so that the artifact created 
would be made more usable in some way. 

KEY TERMS 

Cognitive Affordance: According to Hartson 
(2003), this type of affordance is the information 
that allows the user to understand the purpose of 
the artifact that has this affordance. 

False Affordance: Gaver (1991) used this term 
to refer to information that makes the actor think that 
there is an affordance when in fact there is none. 



Functional Affordance: The last of Hartson’s 
(2003) definitions, this type of affordance is the 
feature of the artifact that allows the actor to 
actually accomplish the work that the artifact is 
supposed to perform (the usefulness of the artifact). 

Hidden Affordance: Gaver (1991) used this 
term to represent the affordance of an artifact that 
the user cannot perceive. Thus, while the affordance 
is there, there is no perceptible information for the 
actor to realize that the affordance is there. 

Perceived or Perceptible Affordance: The 
term perceived affordance was created by Norman 
(1988), whereas perceptible affordance was coined 
by Gaver (1991). They both refer to a property of an 
artifact that provides observable cognitive clues as 
to its usage and function by an actor. 

Real or Physical Affordance: The term 
affordance was first proposed by Gibson (1977). 
The term real affordance was proposed by Norman 
(1988), and the term physical affordance was pro- 
posed by Hartson (2003). They all refer to the same 
definition proposed by Gibson, which is that 
affordance is an actionable property between the 
world and an actor. An affordance does not have to 
be perceptible by the actor. 

Sensory Affordance: Again, one of Hartson’s 
(2003) definitions, this affordance is a feature of an 
artifact that helps the user sense something about 
the artifact. 

INTRODUCTION 

The notion that the human information processing 
system has a limit in resource capacity has been 
used for over 100 years as the basis for the investi- 
gation of a variety of constructs and processes, such 
as mental workload, mental effort, attention, elabo- 
ration, information overload, and such. The dual 
task or secondary task technique presumes that 
the consumption of processing capacity by one task 
will leave less capacity available for the processing 
of a second concurrent task. When both tasks 
attempt to consume more capacity than is available, 
the performance of one or both tasks must suffer, 
and this will presumably result in the observation of 
degraded task performance. 

Consider, for example, the amount of mental 
effort devoted to solving a difficult arithmetic prob- 
lem. If a person is asked to tap a pattern with a finger 
while solving the problem, we might be able to 
discover the more difficult parts of the problem 
solving process by observing changes in the perfor- 
mance of the secondary task of finger tapping. 
While a participant is reading a chapter of text in a 
book or on a Web browser, we might be able to use 
this same technique to find the more interesting, 
involving, or confusing passages of the text. Many 
implementations of the secondary task technique 
have been used for more than a century, such as the 
maintenance of hand pressure (Lechner, Bradbury, 
& Bradley, 1998; Welch, 1898), the maintenance of 
finger tapping patterns (Friedman, Polson, & Dafoe, 
1988; Jastrow, 1892; Kantowitz & Knight, 1976), the 
performance of mental arithmetic (Bahrick, Noble, 
& Fitts, 1954; Wogalter & Usher, 1999), and the 
speed of reaction to an occasional flash of light, 
a beep, or a clicking sound (e.g., Bourdin, Teasdale, 
& Nougier, 1998; Owen, Lord, & Cooper, 1995; 
Posner & Boies, 1971). 



In using the secondary task technique, the par- 
ticipant is asked to perform a secondary task, such 
as tapping a finger in a pattern, while performing the 
primary task of interest. By tracking changes in 
secondary task performance (e.g., observing erratic 
finger tapping), we can track changes in processing 
resources being consumed by the primary task. This 
technique has been used in a wide variety of disci- 
plines and situations. It has been used in advertising 
to study the effects of more or less suspenseful parts 
of a TV program on commercials (Owen et al., 
1995) and in studying the effects of time-com- 
pressed audio commercials (Moore, Hausknecht, & 
Thamodaran, 1986). It has been used in sports to 
detect attention demands during horseshoe pitching 
(Prezuhy & Etnier, 2001) and rock climbing (Bourdin 
et al., 1998), while others have used it to study 
attention associated with posture control in patients 
who are older or suffering from brain disease (e.g., 
Maylor & Wing, 1996; Muller, Redfern, Furman, & 
Jennings, 2004). Murray, Holland, and Beason (1998) 
used a dual task study to detect the attention de- 
mands of speaking in people who suffer from apha- 
sia after a stroke. Others have used the secondary 
task technique to study the attention demands of 
automobile driving (e.g., Baron & Kalsher, 1998), 
including the effects of distractions such as mobile 
telephones (Patten, Kircher, Ostlund, & Nilsson, 
2004) and the potential of a fragrance to improve 
alertness (Schieber, Werner, & Larsen, 2000). 
Koukounas and McCabe (2001) and Koukounas and 
Over (1999) have used it to study the allocation of 
attention resources during sexual arousal. 

The notion of decreased secondary task performance 
due to a limited-capacity processing system 
is not simply a laboratory curiosity. Consider, for 
example, the crash of a Jetstream 3101 airplane on 
approach for landing, which killed all on board. 
The airplane had deviated slightly from its course, 
and shortly afterward the flight crew declared an 
emergency to the approach controller, citing engine 
failure as the cause. The U.S. National Transportation 
Safety Board (NTSB, 2000), however, concluded 
that the airplane simply ran out of fuel and 
that the crew had not considered this possibility. The 
airplane’s performance capabilities and simulator 
tests suggested that the flight crew still should have 
been able to land the airplane with the first engine 
out, with the second engine erratic, or with both 
engines out. The NTSB report surmised that the 
failure of the first engine could have caused the 
pilots to “fixate on instruments such as the altitude 
indicator and airspeed indicator and to allow the 
course heading to wander” (NTSB). 

In the same way, we can observe erratic or 
degraded performance on tasks that are performed 
concurrently with other ordinary, everyday tasks, 
such as watching TV, reading from a book or 
computer screen, or driving a car. If we can observe 
erratic or degraded performance on a secondary 
task, then we can presume that the primary task of 
watching TV, reading, or browsing a Web site is 
consuming quite a lot of the person’s mental pro- 
cessing capacity. There are three conclusions that 
we can draw from such observations. 

1. We need to consider this limited capacity of the 
human processing system and the potential for 
dysfunctional performance when designing 
human-machine systems such as aircraft, au- 
tomobiles, ordinary and everyday office com- 
puter applications, Web sites, and so forth. 

2. We can use this observation of degraded per- 
formance on concurrent tasks as a way to 
identify human overload or failure points in a 
human-machine system. 

3. We can use this observation of degraded per- 
formance as a measure of a variety of human 
mental processes, such as attention, mental 
effort, information overload, and such. 

The first issue is the motivation behind this article. 
The remainder of this article, however, will 
focus on the latter two issues. First will be a brief 
theoretical discussion of how interference, or 
dysfunctional mental processing performance, occurs 
from a black-box perspective of the system. This 
will be followed by a discussion of how this 
interference can be observed with the so-called dual 
task or secondary task technique, used in the 
measurement of mental overload, mental attention, 
mental effort, and such. 



BACKGROUND 

The concept of information overload is based on the 
assumption that the human information processing 
system has a limit in its capacity to process informa- 
tion. Most of us could effortlessly add two 2-digit 
numbers, but would experience extreme difficulty in 
attempting to add three 10-digit numbers without 
some additional scratch-pad memory in the form of 
a pencil and paper. The operationalization of evi- 
dence for information overload relies on the prob- 
ability of errors in task performance. 

Studies in the 1950s and 1960s attempted to 
locate a bottleneck in the processing system as if it 
were a single-channel serial transmission line (cf. 
Welford, 1967). Broadbent (1954, 1957) proposed 
that there was a many-to-one selection switch in the 
channel, with throughput limited by how fast this 
switch could operate in selecting parallel input sig- 
nals. Moray (1967), however, proposed that the 
system behaved instead like a flexible central pro- 
cessor of limited capacity. 

The idea of a limited-capacity central processor 
was furthered by Kahneman (1973), Kerr (1973), 
and others. The idea was that the processing system 
is very flexible in the kinds of tasks that it can 
process concurrently at any given instant, but that it 
is very limited in its overall size. Kahneman viewed 
the earlier models of processing as explanations of 
structural limitations in processing. We cannot, for 
example, focus our eyes on two objects simulta- 
neously. The limited-capacity processor model was 
proposed by Kahneman and contemporaries as an 
explanation of how some mental processing tasks 
can be performed concurrently. 

There is currently no single correct view of the 
mechanisms that cause the human processing system 
to be limited in its ability to process information. 
Importantly, we know that the human processing 
system is not just a single-resource processor, but 
that there are multiple resources that can be limited 
(cf. Friedman et al., 1988; Rollins & Hendricks, 1980; 
Treisman & Davies, 1973). From a practical 
perspective, however, we can still observe degraded 
performance when the processing system is asked to 
perform too much work. Whether the actual cause is 
a bottleneck in a serial system, division or 
sharing in a single-processor system, or division or 
sharing in a multiple-resource system, we can assume 
that the system is being swamped somehow, 
somewhere, if we can observe concurrent task 
interference. 



SECONDARY TASK METHODS THAT 
CAN BE USED TO MEASURE 
ATTENTION 

RT Probe 

In using the RT (reaction time) probe, the participant’s 
reaction time in responding to a secondary stimulus is 
of interest. As the demands for processing the pri- 
mary task increase, we begin to see interference with 
the performance of the secondary task, manifested 
by (often in this order) increased reaction times, 
greater variance in reaction times, and misses (fail- 
ure to react to the stimulus) and false alarms (react- 
ing in absence of a stimulus). When using this proce- 
dure, the quality of responses to the secondary stimu- 
lus serves as a probe into, or sensor of, the processing 
demands of the primary task. Measures of decreased 
performance on the secondary task are taken to 
indicate increased consumption of processing re- 
sources by the primary task. 
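
As a rough illustration of the four indicators just listed, the Python sketch below scores one session from two lists of timestamps. It is a minimal sketch under stated assumptions rather than a published scoring procedure: the function name, the dictionary keys, and the 1.5-second response window are all hypothetical choices made for this example.

```python
from statistics import mean, pvariance


def score_rt_probe(stimuli, responses, window=1.5):
    """Score probe performance from timestamps given in seconds.

    stimuli, responses: sorted lists of onset and response times.
    window: assumed cutoff; a response counts as a reaction only if it
    follows a stimulus within this many seconds.
    """
    reaction_times = []
    misses = 0
    used = set()  # indexes of responses already matched to a stimulus
    for s in stimuli:
        # Take the first unmatched response that falls inside the window.
        match = next(
            (i for i, r in enumerate(responses)
             if i not in used and s < r <= s + window),
            None,
        )
        if match is None:
            misses += 1
        else:
            used.add(match)
            reaction_times.append(responses[match] - s)
    return {
        "mean_rt": mean(reaction_times) if reaction_times else None,
        "rt_variance": pvariance(reaction_times) if reaction_times else None,
        "misses": misses,
        # Responses that were not matched to any stimulus.
        "false_alarms": len(responses) - len(used),
    }


# Three probes; the second is missed and one response has no stimulus.
print(score_rt_probe([1.0, 5.0, 9.0], [1.4, 3.0, 9.6]))
```

Comparing these scores between an easy and a demanding stretch of the primary task would be one way to read the probe; the choice of window would need to be justified for each study.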

Secondary stimuli are typically implemented as 
randomly spaced beep sounds (e.g., Owen et al., 
1995) or brief flashes of light at random intervals 
(e.g., Moore et al., 1986; Stapleford, 1973). Varia- 
tions on these methods could also be used. In the 
Baron and Kalsher (1998) study, while participants 
performed a simulated automobile driving task on a 
computer, they were to push a button as rapidly as 
possible after the presentation of a stop sign that 
appeared on the screen at random intervals. Partici- 
pants are typically asked to press a handheld button 
in response to secondary stimuli, but vocal responses 
are also often used. The Bourdin et al. (1998) rock 
climbing study took vocal reaction times through a 
helmet microphone in response to auditory beeps. 
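
Randomly spaced stimuli of this kind can be scheduled in a few lines. The Python sketch below is an assumption-laden example, not the procedure of any cited study: the function name and the 5 to 15 second gap range are hypothetical, and the gaps are drawn from a uniform distribution simply because that is the easiest choice to state.

```python
import random


def probe_schedule(duration, min_gap=5.0, max_gap=15.0, seed=None):
    """Return probe onset times (seconds) separated by uniformly random gaps."""
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    times = []
    t = rng.uniform(min_gap, max_gap)
    while t < duration:
        times.append(t)
        t += rng.uniform(min_gap, max_gap)
    return times


# Onset times for a two-minute session.
print(probe_schedule(120, seed=1))
```
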



Tapping Task 

Jastrow (1892) describes the use of finger tapping 
tasks (secondary tasks) performed concurrently 
with such processes as mental math and reading 
under various conditions (primary tasks). Jastrow 
listed the following as secondary tasks. 

• Tapping a finger at a regular rate of the participant's choosing 
• Tapping a finger at a regular rate but as quickly as possible 
• Tapping along with (paced by) a metronome 
• Tapping in groups of twos, threes, fours, and so forth 
• Tapping in alternate groups of threes and twos, of sixes, fours, and twos, and so forth 



By using secondary tasks that were more or less 
difficult, Jastrow was able to take attention mea- 
sures that worked under different conditions. 

Kantowitz and Knight (1976) used finger tap- 
ping paced visually with a computer-timed light 
blink. Friedman et al. (1988) used rapid finger 
tapping. Note that some such secondary tasks could 
be difficult enough that they themselves interfere 
with primary task performance. In the study of 
Friedman et al., interest was not so much in the 
performance of the finger tapping task, but in the 
way that the finger tapping task interfered with the 
ability of participants to recall nonsense words that 
had been displayed during the finger tapping. In 
some dual task studies such as this, we might expect 
both tasks to interfere with each other. Often, when 
dual task studies refer to the use of the secondary 
task technique, however, the design attempts to 
keep the secondary task from interfering with the 
performance of the primary task such that degrada- 
tions in the performance of the secondary task (and 
not vice versa) serve as a probe into processes 
associated with the primary task. 
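
One simple way to quantify degraded tapping is the variability of the intervals between taps. The Python sketch below uses the coefficient of variation for this purpose; that measure and the function name are assumptions made for illustration and are not taken from the studies cited above.

```python
from statistics import mean, pstdev


def tapping_variability(tap_times):
    """Coefficient of variation of inter-tap intervals (0 means perfectly regular)."""
    intervals = [b - a for a, b in zip(tap_times, tap_times[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three taps")
    return pstdev(intervals) / mean(intervals)


regular = tapping_variability([0.0, 0.5, 1.0, 1.5, 2.0])
erratic = tapping_variability([0.0, 0.5, 1.4, 1.7, 2.9])
print(regular, erratic)  # the erratic series yields the larger value
```
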



Grip Maintenance 



Welch (1898) describes a device that she was using 
to take quantitative measures of attention in the 
1800s. The participant was asked to hold a constant 
grip on a spring-loaded handle. The handle was 
attached to a lever; at the end of the lever was a pen 
that left a mark on a revolving drum. As the partici- 
pant loosened or tightened his or her grip on the 
handle, the pen moved one way or the other on the 
revolving drum. With this apparatus, Welch could 
trace physical changes in grip over time. Welch 
observed that error in maintaining a constant grip 
corresponded with an increase in effort and with an 
increase in the number of simultaneous tasks that the 
participant was asked to perform. 

Lechner et al. (1998) describe grip maintenance 
as a method that has seen recent use in studies of 
sincerity of effort in physical therapy. It otherwise 
does not seem to be in common use as a secondary 
task probe. However, it seems that it would not be 
especially difficult to implement this method through 
a mouse on a computer. A spring could be attached 
to the mouse, or a weight could be tied to the mouse 
cord hanging over the back edge of the table. The 
secondary task would be to maintain the mouse in a 
constant position against the force of the weight or 
spring. 
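
If such a mouse-based version were built, the error in holding position could be summarized much as Welch's pen trace summarized grip. The Python sketch below assumes the mouse position is sampled along one axis at regular intervals and reports the root-mean-square deviation from the target; the function name, the sampling scheme, and the choice of measure are all hypothetical.

```python
from math import sqrt


def rms_deviation(positions, target):
    """Root-mean-square deviation of sampled positions from the target position."""
    if not positions:
        raise ValueError("no samples")
    return sqrt(sum((p - target) ** 2 for p in positions) / len(positions))


# Samples (in pixels) drifting around a target position of 100.
print(rms_deviation([97, 103, 100, 104, 96], target=100))
```

Larger values during a given stretch of the primary task would be read as poorer maintenance of the secondary task.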

Other Secondary Task Measures 

Mental arithmetic is sometimes used as a secondary 
task (e.g., Bahrick et al., 1954). Wogalter and Usher 
(1999) asked participants to say answers to math 
problems aloud while attempting to install a com- 
puter hard-disk drive according to the instruction 
manual. Brown, McDonald, Brown, and Carr (1988) 
paired handwriting with listening. In a simulated 
automobile driving task, Young and Stanton (2002) 
asked participants to judge whether a pair of geo- 
metric shapes in the lower left corner of the screen 
was the same or different by pressing buttons at- 
tached to the steering stalk. 

Note that we can pair almost any set of tasks in 
which performance for one of the tasks runs across 
a range from baseline performance to degraded 
performance, with degraded performance taken as 
a measure of increased resource consumption by the 
other task. Schieber et al. (2000), for example, 
paired the tasks of tuning a radio and driving a car. 
Although our interest might be associated with issues 
of automobile driving, radio tuning is actually the 
primary task here, while degradations in driving 
performance are observed as a secondary task measure 
of the amount of processing resources being consumed 
by radio tuning. 

FUTURE TRENDS 

This article has described some uses of the second- 
ary task technique in the measure of attention, 
mental effort, and such over the past century. Al- 
though the technique can be relatively simple to 
implement and relatively low tech in many (but 
certainly not all) cases, it nonetheless remains use- 
ful. In recent years, there have been attempts to 
combine the dual task technique with more sophisti- 
cated methods such as magnetic resonance imaging 
(use of magnets and radio waves to construct brain 
pictures; see discussion in Corbetta & Shulman, 
2002), but it is unlikely that more complicated 
methods will replace the secondary task technique in the 
foreseeable future. The secondary task technique is 
appealing in that it is very portable and relatively 
unobtrusive to the environment of the task under 
study. 

Greater use of the secondary task technique is 
therefore advocated in the study of ordinary 
and everyday human-computer systems. For example, 
it could easily and unobtrusively be incorporated 
into the usability testing of Web sites. While 
the participant is involved in an assigned task during 
a Web site usability test, he or she could be asked to 
simply tap a finger at a regular pace. The casual 
observation of changes in the performance of this 
secondary task could be taken as an objective obser- 
vation that a particular step in the primary task is 
consuming a substantial amount of processing re- 
sources. Nothing in the usability study needs to be 
altered in order to implement the secondary task 
technique in this way. 
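
Where the taps can be logged (for example, as key presses), casual observation could be supplemented by a simple comparison against a baseline. The Python sketch below flags the steps of a usability test in which tapping is markedly more irregular than it was at rest. It is only a sketch: the function names, the step labels, and the factor of 2.0 are hypothetical, and it assumes the baseline tapping is not perfectly regular.

```python
from statistics import mean, pstdev


def interval_cv(tap_times):
    """Coefficient of variation of the intervals between successive taps."""
    intervals = [b - a for a, b in zip(tap_times, tap_times[1:])]
    return pstdev(intervals) / mean(intervals)


def flag_demanding_steps(baseline_taps, taps_by_step, ratio=2.0):
    """Return the steps whose tapping is `ratio` times more irregular than baseline."""
    baseline = interval_cv(baseline_taps)
    return [step for step, taps in taps_by_step.items()
            if interval_cv(taps) > ratio * baseline]


# Tap timestamps (seconds) at rest and during two steps of a test session.
baseline = [0.0, 0.5, 1.02, 1.5, 2.01]
session = {
    "search": [10.0, 10.5, 11.0, 11.52, 12.0],
    "checkout": [20.0, 20.4, 21.3, 21.6, 22.8],
}
print(flag_demanding_steps(baseline, session))
```
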

CONCLUSION 

The secondary task technique is portable, relatively 
uncomplicated, and relatively unobtrusive as a probe 
into a variety of mental processes associated with 
attention. It has been in use for over a century and 
continues to be used in a variety of disciplines. 
Although it has some limitations (see Owen, 1991), 
these are not likely to be an issue in practical 
applications such as usability testing. This article has 
not detailed the how-to aspects of using this 
technique, but the implementation of some of the 
simpler uses, such as the observation of degraded 
performance in finger tapping, should be reasonably 
obvious. A more detailed account of how to implement 
the RT-probe technique, useful in settings that 
might require more rigorous tests, can be found in 
Owen et al. (1995). Simpler implementations such as 
finger tapping tasks, however, should be adequate in 
many applied situations such as Web usability testing. 

KEY TERMS 

Attention: Mental processing that consumes 
our conscious thinking. This is associated with a 
variety of more specific constructs such as mental 
effort, mental focus, mental elaboration, and such. 
Processes associated with the attention-related con- 
structs are what we presume to be detecting in dual 
task studies. 

Dual Task Study: A study in which two tasks 
are performed concurrently to observe changes in 
task interference. Usually, the participant is ex- 
pected to or asked to focus on the primary task so 
that interference is observed only in the secondary 
task, but this is not necessarily always the objective. 
Observations of task interference are taken to sug- 
gest that the limits of the processing system are 
being reached. 

Limited-Resource Model: The idea that the 
human information processing system has a limited 
pool of resources available for the concurrent per- 
formance of any number of tasks. The observation 
of degraded performance in one or more processing 
tasks is taken to suggest that the capacity of the 
system is being approached. 

Primary Task: The resource consumption task 
that is of interest in a secondary task study. Often 
(but not necessarily), the participant is asked to 
focus on the performance of this task while concur- 
rently performing the secondary task. For example, 
the participant could be asked to focus on reading a 
passage of text on successive screens of a computer 
display (primary task) while concurrently pressing a 
handheld button switch whenever a random beep 
sound is heard (secondary task). 

RT Probe (Reaction Time Probe): A com- 
monly used secondary task in which changes in 
reaction time performance of the secondary task are 
of interest. 



Secondary Task: The task that is used as a 
probe in a secondary task study. Changes in the 
performance of the secondary task are taken to 
suggest changes in the processing of the primary 
task or the detection of processing system overload. 

Secondary Task Technique: A dual task study 
in which one task is designated as the primary task 
of interest while a secondary task is concurrently 
performed as a probe to test the consumption of 
processing resources by the primary task. Changes 
in secondary task performance are taken to indicate 
changes in resource consumption by the primary 
task. 

INTRODUCTION 

Information technology (IT), computer science, and 
other related disciplines have become significant 
both in society and within the field of education. 
As a result of the considerable developments of 
recent decades towards a global information society, 
the demand for a qualified IT workforce has increased. 
The integration of information technology into the 
different sectors of everyday life is increasing the 
need for large numbers of IT professionals. 
Additionally, the need for nearly all workers to have 
general computing skills means that individuals who 
lack these skills may face inequality or displacement 
in modern society, further contributing to the 
digital divide. Thus, IT education is more important 
than ever for the whole of society. 

Despite the advances and mass adoption of new 
technologies, IT and computing education continu- 
ally suffers from low participant numbers, and high 
dropout and transfer rates. This problem has been 
somewhat addressed by introducing mentoring pro- 
grams (von Hellens, Nielsen, Doyle, & Greenhill, 
1999) where a student is given a support person, a 
mentor, who has a similar education background but 
has graduated and is employed in industry. Although 
the majority of these programs have been considered 
successful, it is important to note that success is 
difficult to measure in this context. 

In this article, we introduce a novel approach to 
mentoring that was adopted as part of an ongoing, 
traditional-type mentoring program in a large 
Australian university. The approach involved 
introducing modern communications technology, 
specifically mobile phones with an integrated camera 
and the capability to use multimedia messaging 
services (MMS). As mobile phones have become an 
integrated part of our everyday life (with high 
adoption rates) and are an especially common medium 
of communication among young people, it was 
expected that the phones could easily be 
employed in the mentoring program (phones were 
provided for the participants). Short message 
service (SMS), that is, text messaging, has become 
a frequently used communication channel 
(Grinter & Eldridge, 2003). In addition to text, photo 
sharing has also quickly taken off as MMS-capable 
mobile phones have become more widespread. The 
ability to exchange photos increases the feeling of 
presence (Counts & Fellheimer, 2004), and the 
possibility of sending multimedia messages with mobile 
phones has created a new form of interactive 
storytelling (Kurvinen, 2003). Cole and Stanton (2003) 
found pictorial information exchange to be a 
potential tool for children’s collaboration during 
storytelling, adventure gaming, and 
field trip tasks. 

Encouraged by these findings, we introduced 
mobile mentoring as part of a traditional mentoring 
program, and we present our experiences here. It is 
hoped that these experiences can affirm the legitimacy 
of phone mentoring as a credible approach to mentoring. 
The positive and negative experiences presented in 
this article can help to shape the development of 
future phone mentoring programs. 

BACKGROUND 

Current education programs relating to information 
technology continue to suffer from low applicant 
numbers in relation to the available enrollment posi- 
tions. In the USA alone, the number of computer 
science graduates dropped from a high of 50,000 in
1986 to 36,000 in 1994, as reported by the Office of
Technology Policy in 1998 (von Hellens et al., 1999).
Many general IT degrees also have high dropout 
rates, particularly in the transition from the first to 
second year of undergraduate studies. Student
statistics also show that university IT degree
programs are not attracting high-achieving
students. Possible reasons include the low
entrance-level scores needed to enter the program,
the attraction of high-entrance-level degree
programs such as medicine, law, and psychology,
and the confusion and uncertainty about what a
career in IT will entail (ASTEC, 1995).

The misconception that IT is a field for those
with masculine attributes persists and is
reinforced by teaching at secondary school level
(Beekhuyzen & Clayton, 2004; Greenhill, von
Hellens, Nielsen, & Pringle, 1997), which often has
a negative effect on students, particularly
females. Consistent results
have been obtained in studies concerning high school 
physics, which faces similar difficulties and biased 
ideas as IT (Hakkila, Karkas, Aksela, Sunnari, & 
Kylli, 1998). A remarkable number of university 
students choose their area of study without any 
preliminary experience in the particular field. With 
information technology, the students also often have 
unclear or distorted perceptions of what to expect 
later in their studies or after graduation, including 
what kind of employment their area of study can 
offer (Nielsen, von Hellens, Pringle, & Greenhill, 
1999). 

Within the IT context, university student mentoring 
has been introduced to offer students insight into the 
industry and to employment possibilities, enabling
them to take a closer look at the everyday life
of working in the field. The aim is to dispel some of 
the misconceptions associated with what IT work is 
all about. When entering into this mentoring pro- 
gram, the student is matched with a personal mentor 
who has a similar educational background and is 
currently employed in the IT industry. Convention- 
ally, mentoring is carried out with face-to-face meet- 
ings, e-mail and telephone conversations between 
mentor and mentee. In line with many published 
studies, early results from our studies suggest that 
mentoring can provide valuable information on
career possibilities, thus increasing the
motivation to study and work in the area. It also
clarifies and enhances student perceptions
concerning the realities of the field. Note that
all participation in the program is on a voluntary
basis, and no financial benefits are obtained.

When commencing the traditional part of the 
mentoring program, mentors and mentees partici- 
pate in an initial short training session. In this session, 
the mentoring partners are introduced, and the
roles and expectations of mentors and mentees are
discussed. Mentor and mentee generally meet
thereafter on a regular basis during one semester
(usually 13-15 weeks), at times that suit both
parties. Face-to-face communication is also
usually complemented by e-mail conversations.
A mid-program event is organized by the Alumni 
Association, usually with a presentation by an indus- 
try representative on a pertinent topic such as net- 
working (in terms of meeting people, making con- 
tacts, etc. — a skill particularly useful within the IT 
industry). A final session is held to close the program 
and gather together all program participants to dis- 
cuss their experiences. 

ENHANCING COMMUNICATION WITH
MOBILE TECHNOLOGY 

In addition to the traditional mentoring methods 
being employed by the mentoring program in the 
university, we have introduced the use of mobile 
communication technology into the mentoring pro- 
gram. The primary aim in introducing the novel 
approach was to augment communication during the 
mentoring process. The aim was not to replace the
conventional communication media but to add
value with features offered by the mobile communi- 
cation device. A pilot study was conducted in 2003. 
Due to positive feedback, the approach has contin- 
ued to be integrated in the traditional program in 
2004. 

The equipment used in the experiment consisted of
two Nokia 7650 mobile phones, of which one was
given to the mentee and one to the mentor for the
duration of the program. The mentor was advised to
communicate with the student about anything that
(s)he felt was a relevant part of their work and
leisure, and especially to use picture messaging
as an illustrative supplement. Using this type of
technology to communicate raises many issues
relating to human-computer interaction. For
example, the size of the screen, the structure of
the information being viewed/sent (Chae & Kim,
2003), and the increasing complexity of
functionality can lead to ineffective use of the
mobile device. However, benefits include the
ability to access information from anywhere without
the need to physically sit at a computer workstation
(Chae & Kim, 2003).

Figure 1. Two multimedia messages describing
the work tasks of a mentor

Mentor, 1:04 am: “I have a conference paper
deadline tomorrow, i just finished working with
it. Here's a pile of articles and other stuff on
my desk. I'm very messy, heh..! Anyway, i just
heard i got a scholarship and i can go to that
conference in Switzerland, yes!! Well, good night
now...”

Mentor, 9:48 pm: “I do paper prototyping and
interviewing for my research project. One thing i
like in this field is that it offers so different
kinds of tasks. I also get to talk and meet people
quite a lot - funny that people think IT is not
social!”

The student mentee was given a fixed amount (AUD
15) of pre-paid credit on the mobile phone, which
they were allowed to use during the study. The
prices for sending an SMS message and an MMS
message were AUD 0.20 (20 cents) and AUD 0.75 (75
cents), respectively. The phone mentoring period
lasted for one week per mentor-mentee pair. At the
beginning of the experiment period, the functions
of the phone were explored together to ensure
seamless communication. At the completion of the
one-week period, the participants gave their
feedback about the experience via a questionnaire.
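
The credit and price figures quoted above make it
easy to check how far the pre-paid budget could
stretch. The following is a minimal sketch and not
part of the original study: the function names are
hypothetical, and amounts are held in cents so that
the arithmetic stays exact.

```python
# Credit and prices as reported in the text, in Australian cents.
CREDIT_CENTS = 1500   # AUD 15 of pre-paid credit
SMS_CENTS = 20        # AUD 0.20 per SMS
MMS_CENTS = 75        # AUD 0.75 per MMS


def max_messages(credit_cents: int, price_cents: int) -> int:
    """Largest number of messages of one type the credit can pay for."""
    return credit_cents // price_cents


def remaining_credit(credit_cents: int, sms_count: int, mms_count: int) -> int:
    """Credit left (in cents) after sending the given mix of messages."""
    spent = sms_count * SMS_CENTS + mms_count * MMS_CENTS
    if spent > credit_cents:
        raise ValueError("message mix exceeds the available credit")
    return credit_cents - spent


if __name__ == "__main__":
    print(max_messages(CREDIT_CENTS, SMS_CENTS))   # 75
    print(max_messages(CREDIT_CENTS, MMS_CENTS))   # 20
    # Hypothetical usage pattern: one MMS per day over seven days.
    print(remaining_credit(CREDIT_CENTS, 0, 7))    # 975
```

Under these assumptions the credit covers at most
20 multimedia messages or 75 text messages, so a
mentee sending about one MMS per day would use
roughly a third of it in a week.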

The media used in communications between the
mentor and mentee were short messages (SMS) and
multimedia messages (MMS), the latter being more
common. Conversations consisted mainly of one
message or a message and a reply, where the reply
included feedback or a comment on the previous
message. Typically, two messages per day were sent
from mentor to mentee and one from mentee to
mentor, although more messages were exchanged if a
message gave rise to a longer, more detailed
conversation. Messaging times varied from morning
hours to past midnight, and messages were also
sometimes sent during the weekend, as shown in
Figures 1 and 2.

The majority of messages sent contained a short 
description of the work task the mentor was cur- 
rently involved in, accompanied by a picture. In 
addition to the actual task, the messages often 
described the atmosphere at that particular moment 
and also included short opinions (see Figure 1). 
Some of the messages were not primarily related to 
the work tasks, but described more the mentor’s 
personal interests — free time, hobbies, and per- 
sonal preferences. 

The initiative for conversations containing pro- 
fessional information was taken by the mentor, and 
was not motivated by, for example, a question from 
a mentee. However, mentees took the initiative in
reporting on their study-related duties, for
instance, assignments and projects they were
working on. Conversations relating to free time were
initiated equally by both parties. Examples of 
messages relating to free time are illustrated in 
Figure 2. 



Figure 2. Two free-time-oriented multimedia
messages from a mentor (company name of the
employer replaced with asterisks)

Mentor, 11:46 am: “I was rock climbing tonight, I
do it every Tuesdays. I really like it! Feels just
a bit hard to type now :)”

Mentor, 10:27 am: “Aah, it was nice to sleep in!
By the way, we have pretty flexible working hours
in ********. If there’s no meetings or anything in
the morning, u can just come whenever and stay
later in the end of the day. Suits me!”




FEEDBACK 

The feedback obtained from both mentors and
mentees was positive and encouraging. The most
positive aspects reported by the mentees were the
increased frequency of communication, which led to
a closer relationship, and a deeper insight into
the mentor’s work. Comments collected from two
students at the end of the mobile mentoring period
are presented in the following:

• Mentee #1: “I believe that mobile communi- 
cation is quick and easy. It gives you an oppor- 
tunity to learn more about your mentor and 
what they do and vice versa. It is especially 
good when both parties are unable to meet on 
a regular basis due to time constraints, commit- 
ments, etc.” 

• Mentee #2: “The best thing about the phone 
mentoring was that I was able to see how 
another person, in the field I want to work in, 
interacts with their life as well as being able to 
share aspects of my life with my mentor. It 
helped break the ice, enabling my mentor and 
myself to get to know each other.” 

Positive feedback obtained from mentors par- 
ticularly concerned the flexibility in regard to the 
place and time of the communication, ease of use, 
and the extra personal touch it gave to the conversa- 
tions. The amount of credit was reported to be 
sufficient for a one-week experimental period.
Overall, the integration of mobile technology was
considered to offer a valuable tool for the
mentoring program.

Other reported positive aspects of this novel 
approach include: 

• Easy way to communicate 

• Minimal effort required 

• You can do it at any time and you don’t miss the
person

• (The phone is) very popular with people, so it is
an advantage to use it with mentoring

• Quicker, more efficient, because people have
their phone on them more than they check their
emails

• Teaches responsibility, taking care of the phone

• More informative, a picture can say a thousand
words.



As a weakness, mentees referred to the short
length of the experimental period and suggested
that it be extended from one to, for example, two
weeks. They explained that this would offer more
time to get used to both the technology (the MMS
phone) and the mode of communication. A wish for a
longer experimental period was also mentioned by
mentors, as one week was not considered long
enough to cover the different aspects of the
diverse work in IT. The suggestion is also
supported by the messages themselves, as the
communication became more relaxed towards the end
of the week. This feedback from participants in
2003 was integrated into the program run in 2004,
in which longer experimental periods
(1 week/10 working days) are used.

The technical barriers noted were battery time/ 
length and lack of network coverage in some areas, 
which were described to limit the communication in 
some instances. These, in addition to small screen 
size, low bandwidths, limited storage and cumber- 
some input facilities are common barriers that have 
been presented in the literature (Chae & Kim, 2003; 
Tarasewich, Nickerson, & Warke, 2001). However,
the participants’ perception was that the technical
barriers did not have a significant impact on the
overall experiment. A criticism from mentors was
that if mobile communication were the only medium
of interaction in mentoring, the conversations
between mentor and mentee would remain too light,
and no deep knowledge or “big picture” would be
obtained from the short communications. For
instance, the following comment was obtained from
a mentor when asked about the weaknesses of
mobile mentoring:



• Mentor: “The nature of conversations differs 
a great deal in comparison to ones had in face- 
to-face meetings and e-mail exchange, where 
the discussion is held for longer. However, I 
highly recommend this system as an additional 
part of communication, as it offers a possibil- 
ity for more intense and frequent interaction 
and highlights the aspects which otherwise 
hardly were considered, e.g., the work
environment, task descriptions and the time
schedule of the day.”

FUTURE TRENDS 

When the mobile mentoring program first began in 
early 2003, the number of multimedia messaging 
capable phones was minimal. It is expected that 
when they become more common, mobile mentoring 
can be adopted on a larger scale, and it may become a
natural part of the interaction process. However, as
a starting point, it was important to lend out the MMS 
capable phones and get people actively involved in 
the process. 

In 2004, yet another novel approach to mentoring 
was added to the ongoing Alumni Association 
mentoring program in the form of international 
mentoring. In addition to a local mentor to
communicate with (either traditionally and/or by
phone), volunteering student mentees were given
contact persons
working abroad as mentors. The mentoring contacts 
were obtained through the university department’s 
connections to the international IT industry. The 
communication employs mainly e-mail, but also
additional mobile messaging techniques. Due to
difficulties in connectivity between mobile phone
operators, MMS between two phones was found to be
inoperative, so picture messages were exchanged
by sending an MMS from a phone to an e-mail
address. Initial results of this additional
experiment will be reported.

CONCLUSION 

This article introduces how mobile communication 
technology has been embedded into a university 
student mentoring program which was held among 
first-year information technology students within an 
Australian university. The study was implemented 
by giving mentor-mentee pairs mobile phones with 
MMS functionality. Participants communicated with 
each other over a one-week period. Participants were
advised to incorporate visual information into the
communication in the form of multimedia messaging.

Although effective in many situations, mentoring 
can be rather unproductive and thus unsuccessful 
for many reasons. One common reason for failure is 
a lack of structure. Many communications between
mentor and mentee are ad hoc and generally
unplanned, which can and often does result in long
periods between communications. Lack of structure
can also distort perceptions of outcomes and results, 
with no clear aim being achieved. 

The results show that integrating mobile commu- 
nication into the mentoring process has provided 
added value to the traditional program. Participants 
suggest that it enhances the mentoring experience 
and that it can be regarded as a valuable tool in 
communications between the mentor and mentee. 
Positive aspects of the program were identified as 
increased frequency and flexibility in communica- 
tion, which are highly valued because of the time 
constraints of both mentor and mentee. Mentors
also emphasised the easy access and speed of use,
as sending a message with a mobile phone was
regarded as easier and more flexible than e-mail,
which took more time and was limited to work
situations and a computer. Both parties reported
on the development of a deeper personal
relationship and more relaxed communication
between mentor and mentee over time. Including
visual information in the communication in the
form of MMS highlighted new aspects of both the
mentor’s work and her/his lifestyle.

Generally, mobile communication technology was 
found to offer a valuable tool for a mentoring pro- 
gram as a supporting tool of communication, even 
though some weaknesses were identified. The re- 
sults have encouraged the authors to continue the 
integration of mobile communication into mentoring 
to enhance the information exchanged between 
mentor and mentee. Continuing and future research 
in this area includes the continuation of the study 
using both MMS and conventional styles, with im- 
provements according to feedback received from 
this pilot phase concerning issues such as the time 
period devoted to the experiment. The aim is to
increase the number of participants from a relatively
small sample to larger numbers of mentee-mentor
pairs and to attempt to better measure the benefits of 
the program. 

KEY TERMS

Mentee: Participant of the mentoring program; 
student or equivalent; being “advised.” 

Mentor: Participant of mentoring program; the 
“advisor.” 

Mentoring Program: A process where a mentee 
is given a personal guide, a mentor, who has profes- 
sional or otherwise advanced experience and can 
advise the mentee on the specifics about the particu- 
lar field of study and work in the industry. 

Mobile Communication Technology: A me- 
dium to communicate via mobile devices. 

Mobile Mentoring: Mentoring which uses 
mobile communication technology as an integrated 
part of the communication between mentor and 
mentee. 

Multimedia Messaging Service (MMS): A form of
mobile communication, where each message can
contain picture, audio, video, and text material
within certain data size limitations. A multimedia
message is typically sent from one camera phone to
another.

Short Message Service (SMS): A form of mobile
communication, where a mobile phone user is able
to send and receive text messages typically
limited to 160 characters. A short message is
typically sent from one mobile phone to another.
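
The 160-character limit in the SMS definition can
be illustrated with a short sketch. It is a
deliberately naive illustration, not a description
of how networks actually handle long messages: real
multi-part messages reserve some space for
concatenation information, and the limit depends
on the character encoding. The function name
`split_into_sms` is hypothetical.

```python
SMS_LIMIT = 160  # typical single-message limit cited in the definition


def split_into_sms(text: str, limit: int = SMS_LIMIT) -> list[str]:
    """Naively cut text into consecutive chunks of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[i:i + limit] for i in range(0, len(text), limit)]


if __name__ == "__main__":
    # Illustrative text only; any string longer than the limit would do.
    note = "I do paper prototyping and interviewing for my research project. " * 4
    parts = split_into_sms(note)
    print(len(note), [len(part) for part in parts])
```

A message longer than the limit therefore needs
more than one SMS, which is one reason a single
picture message can carry a description more
economically than a series of texts.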

INTRODUCTION

As Kress and Van Leeuwen (2001) state, there is no 
communication without interaction. Broadly, levels 
of “interactivity” can be recognized as depending on 
quality of feedback and control and exchange of 
discourse according to the mode or modes 
(“multimodal discourse”) involved. Important con- 
straints that operate to modify interactivity of any 
kind can be identified as the amount of “common 
ground” (Clark, 1996), constraints of space and 
time, relative embodiment, and choice of or control 
over the means, manner, and/or medium of feed- 
back. 

Ha and James (1998) emphasize the element of 
response as characterized by playfulness, choice, 
connectedness, information collection, and recipro- 
cal communication. 



BACKGROUND: SELECTED 
ELEMENTS OF DIGITAL 
INTERACTIVITY 

Feedback 

Any evaluation of feedback, as defined by Kiousis 
(2002), should take into account various factors. For 
example, feedback should not be just two-way, but 
should encompass several different avenues and 
facets of expression; it can be linear and/or non- 
linear. Hyperlinks should offer the element of choice, 
and the ability to modify the mediated environment 
must exist. Individual perception of interactivity 
depends on the quality of media (form, content, 
structure, relation to user) but also on “social pres- 
ence” (Short, Williams, & Christie, 1976) or 
“telepresence” (awareness of mediated
environment), perceived speed, timing, and flexibility. Kiousis
adds to these factors the concepts of “proximity” — 
how “near” the user feels — and “sensory activa- 
tion” — the involvement of the user’s senses. 

Immersion and Engagement 

The qualities of “immersion” and “engagement,” 
referred to by Douglas and Hargadon (2000) as 
“The Pleasure Principle” and equated by Laurel 
(1993) with the “willing suspension of disbelief,” 
appear to be crucial in creating the illusion of inter- 
action. 

The role of immersion and engagement is obvious 
with reference to simulations, the use of links, and 
user perception of control and decision-making. 

Simulation 

Simulation (particularly as in Game format) privi- 
leges a sensation of control, a sense of presence, and 
entry into mediated environments as “active” rather 
than “passive” through manipulating time (speed 
involved in decision making), agency, the spatial 
orientation of the user, and what Darley (2000) 
describes as “vicarious kinaesthesia:” the feeling of 
“direct physical involvement” (p. 157). Perhaps we 
might add to this list the element of “surprise,” the 
“unexpected,” the apparently random, necessitat- 
ing a response and therefore creating an impression 
of responsive dialogue and mutual discourse, a per- 
ception of feedback and engagement. 

Play 

In all questions of interactivity, the target audience 
must be considered (McMillan, 2002), and the nature 
of links must be examined. Manovich (2001)
complains that by following “pre-programmed,
objectively existing associations,” users of interactive
media are being asked to mistake the structure of 
somebody else’s mind for their own (p. 61). 

One of the characteristics of interactivity is the 
nature of “play” involved. The importance of play in 
performing identity and social structure has long 
been recognized (Huizinga, 1955), and, as Zimmerman 
(2004) has more recently noted, play both expresses 
and simultaneously resists the structure of the sys- 
tem within which it exists. Within any interactive 
system, this element of play could perhaps be seen 
as a crucial factor in removing the impression of a 
predictable structure, which stifles user individuality 
and involvement. Although choices, or links, are 
indeed programmed, there can be no play without 
constraints; games always have “rules” that cannot 
be changed without creating a different “game” 
(unless, of course, this is a device of the game 
creator to produce engagement and thus reinforce 
the nature and structure of the game!) 

This consistency of “world” or “play” further 
contributes to the “willing suspension of disbelief.”
As Douglas (2000) remarks, ambiguity is always 
embedded in the interactive, but this ambiguity can 
be harnessed in service to the sense of play, which 
of itself both provides and subverts the structural 
framework. 

Hypertext: Interactivity as Narrative 
and/or Drama 

No consideration of digital interactivity is possible 
without a discussion of interactive hypertext, often 
characterized as “multidimensional.” It is necessary 
to remember that multidimensional does not mean 
“random explorations,” but what Douglas (2000) 
calls “polysequential” rather than Nelson’s “non- 
sequential” writing (Nelson, 1992), or even Bush’s 
1945 “encyclopedia of associative trails” for Memex 
(Bush, 1992), for in such an “encyclopedia,” al- 
though the associations of the reader will be used to 
construct individual unique meaning or personal 
narrative, the “encyclopedia” has not necessarily 
been structured for this purpose by the author; this 
is the difference between constructed narrative and 
information retrieval. 



Multidimensional hypertext at its best takes ad- 
vantage of and exploits the human tendency to 
construct narratives to make sense of the world, 
relying on individual human selection of appropriate 
stimuli and human ability not simply to choose links 
but to create connections, rather than simply follow- 
ing pre-ordained paths. Joyce (1995) remarks that 
the user/reader’s task is to make meaning by per- 
ceiving order in space, so that the meaning is orderly 
but there is a continual replacement of meaningful 
structures throughout the text: the narrative is con- 
stantly evolving in time and space. 

Murray (1997) identifies three qualities (which 
she calls “pleasures”) that characterize the interac- 
tive audience: immersion, agency, and transforma- 
tion. Immersion, meaning engagement of the imagi- 
nation and the senses, has already been discussed as 
a property of interactivity. Murray emphasizes the 
active audience and differentiates between the role 
of the interactive user/reader and the role of the 
author by describing the user/reader as agent. Her 
emphasis on various points of view as one technique 
for incorporating multi-sequencing in hypertext is 
typical of a narrative approach. 

An alternative approach is that of Laurel (1993), 
who suggests drama as a model for interactivity, and 
emphasizes three features: 




1. Enactment (to act out) rather than to read. 
Narrative is description; drama is action. 

2. Intensification: incidents are selected, arranged,
and represented to intensify emotion and con- 
dense time. 

3. Unity of action versus episodic structure. In 
the narrative, incidents tend to be connected by 
theme rather than by cause to the whole; in 
drama, there is a strong central action with 
separate incidents causally linked to that ac- 
tion. Drama is thus more intense and economi- 
cal. 

When Laurel advocates strategies for designing 
interactive media, she emphasizes that the concep- 
tual structure should encourage the potential for 
action. Laurel outlines several key points for design- 
ing interactive media, and emphasizes that tight 
linkage between visual, kinesthetic, and auditory 
modalities is the key to immersion. 

ENHANCING INTERACTION:
CREATIVE LINKING AND 
INTERACTIVE SPACE 

Link Authoring 

Every interface asks the audience to participate in its 
construction, and creative link authoring is one of the 
most important factors determining whether the au- 
dience will perceive this interface as interactive. 

Early on, Nelson (1992) proposed different simple 
“styles” of guiding the sequencing of hypertext: 
planned variations, which focus on the transmission 
of a message, representing interconnections, repre- 
senting the structure of the subject for the reader to 
explore. Golovchinsky and Marshall (2000) point out 
that the quality and quantity of the reader’s choices 
are confined by the fixity of the links and that the 
“trick” of creating interactive hypertexts is to subvert 
this “fixity.” Choices as to the use of fixed links,
variation of links, and query-mediated links provide a
“hidden” structure, which conditions the audience’s 
choices and reactions to the text as well as the level 
of perceived interaction. Further, linking 
“reconfigures” the text and is crucial to creating the 
placement in space, which gives the text its
multidimensional aspect and “aligns” and “realigns”
meaning, both visual and verbal. As Garrand (1997)
remarks, there must be a balance between the viewer’s
freedom and narrative coherence (the constraints of 
the game further the sense of play!), and subtle and 
appropriate linking creates that balance. Garrand, 
writing with reference to interactive multimedia, 
emphasizes that linking for interaction must be “ver- 
tical” as well as “horizontal,” that interactive writing 
is 3-D writing. 
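
The distinction between fixed and query-mediated
links can be made concrete with a small sketch.
The Python below is illustrative only and is not
drawn from the cited authors: the node names, the
`fixed_links` table, and the word-overlap scoring
in `query_links` are all assumptions chosen for
brevity.

```python
# Hypothetical hypertext: node name -> text content.
nodes = {
    "giotto": "frescoes layering of meaning in space",
    "memory": "memory palace and meaning in space",
    "games": "rules of play and constraints",
}

# Fixed links: targets chosen once by the author.
fixed_links = {"giotto": ["memory"], "memory": ["giotto"], "games": []}


def query_links(query: str, current: str, top_n: int = 2) -> list[str]:
    """Query-mediated links: rank other nodes by words shared with the query."""
    wanted = set(query.lower().split())
    scored = []
    for name, text in nodes.items():
        if name == current:
            continue
        overlap = len(wanted & set(text.lower().split()))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    print(fixed_links["giotto"])
    print(query_links("meaning in space", current="games"))
```

In this sketch the fixed links never change, while
the query-mediated links depend on what the reader
asks for, which is one way of loosening the
“fixity” of an authored structure.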

Links both emphasize the visual element of the 
text itself — the text as a visual feature — and help to 
create an “enactment” of three dimensional space in 
the spatial relations of “navigation” (up/down, left/ 
right, etc.) and in the impression of “layering.” In 
hypermedia link authoring, where hypertext is linked 
with images, videos, sounds, animations, and so forth, 
linking makes clear that verbal text is only one kind of 
content, and that a link does not just “match” verbal 
text, sound, image, and so forth, but reveals content 
from different perspectives. Although links used in 
the course of interactive exploration can give the
impression of what Douglas (2000) refers to as an
“unlimited database,” too much detail and too many 
links detract from immersion. 

Interactive Space: Visual and Verbal 

One aspect of digital interactivity is about creating 
the impression of the enactment of an infinite 
possibility of sequencing through creative linking; 
structure and content are formed by and equated 
with space “traveled.” The physical action of “click- 
ing” to select links is combined with the mental 
action of “connecting” links; both serve to structure 
and layer digital space, and to produce the sensation 
of movement through space. As noted before, users 
do not visualize themselves traveling up and down 
a line, or even back and forth on branching lines to 
and from a center of meaning, but navigating through 
3D space. Further, identification of what might be 
considered as being “inside” or “outside” the text 
loses meaning and importance. This “virtual” space 
is self-contained but through linking and association 
can contain more than the “sum of its parts.” 

As Wertheim (1999) has remarked, the frescoes 
of Giotto in the Arena Chapel of Padua (1305) 
provide a visual parallel and enactment of this kind 
of Memory Palace, and also a precedent for the 
layering of meaning in space, which has come to be 
seen as characteristic. 

Livingstone (1999) also points out that the
physical movement of the human agent (in clicking,
choosing paths, etc.) manipulates objects, which 
exist only in digital space, as if they existed within 
physical space. He compares this to Lakoff and 
Johnson’s “embodied interaction” (1980), and, as 
we “drag” objects onto and around the screen, the 
conceptual relationships we make between the real 
and the digital form the foundations of a com- 
pletely new interactive space with its own specific 
characteristics, and its own formulae for conveying 
meaning. 

“Paths” of reading are also important for the 
creation of interactive space. As Kress (2003) 
makes clear, reading paths are culturally dictated 
(left to right/right to left, etc.). “Multimodal” texts
open the question of reading paths — in terms of 
“directionality” (which direction?) and in terms of 
which elements the reader chooses as “points” 
along the reading path. What are the elements to be
read together? (Just as children learning to read do 
not make the assumptions about “ordered” reading 
space that trained adults do.) Is the reader looking at 
a text to be “read” as a conventional text, a text to 
be “read” as an image, an image to be “read” as part 
of a text? Thus the “reading” of an interactive 
verbal/visual text “screen” implies that the reader
establishes the order through his/her own preferences
as to relevance, thereby constructing a personalized 
meaningful space. 

The creation of interactive multimodal discourse 
thus demands that authors and designers consider 
carefully the interplay between visual and verbal 
units of meaning and their placement, not simply in 
terms of the space of the screen, but in terms of the 
relative value of that space, and how juxtaposition in 
that space affects the relative values of text and 
image. Not only do text and image provide different 
possibilities for the creation of meaning and “en- 
gagement,” but verbal text on-screen becomes an- 
other aspect of the visual (fonts, graphics, visual 
sculpting of blocks of text, layout, etc.) — and this 
should be taken into account by creators to capitalize 
on capacity for interactivity. 

De Certeau (1988) suggests that, through the 
“spatial practice” of walking, the pedestrian learns 
to create and inhabit his own city by the paths he 
chooses. A similar creation of personal space in 
virtual space is important for immersion and engage- 
ment, which is why Johnson-Sheehan and Baehr 
(2001) place such importance on the use of “design 
metaphors” — architectural, physical spaces such as 
cafes, museums, and so forth, to involve the user 
physically — and why the use of visual features 
(frames, icons, images) to create possibilities for the 
navigator, rather than simply as “dead” links, is also 
relevant to user perception of screen space as 
interactive space (Laine, 2002).

Darley (2000) has proposed that the interactive 
element of visual digital culture is best thought of as 
related to earlier forms of entertainment — like the 
amusement park, or music hall for example, which 
demand active participation from the audience — 
rather than more contemporary media, like television 
or cinema. This comparison highlights an aspect of 
visual digital interactivity which often is not considered
adequately because it is so obvious: the screen
is not a television, not only in the aspect of viewer
control or “interactivity”, but also in the way that
images are presented, sequenced, used, and “valued.”






FUTURE TRENDS 

An increased implementation of techniques to en- 
hance the impression of interactivity is important for 
every aspect of digital media. Some interesting 
future applications include “Peer-to-Peer Commu- 
nications/Visualizing Community” (Burnett, 2004), 
design practice in humanities-based applications 
(Strain & VanHoosier-Carey, 2003), and the field of 
interaction design as a whole. As Lowgren (2002) 
remarks: “Interaction Design is a fairly recent
concept... It clearly owes part of its heritage to HCI,
even though the turns within established design 
fields — such as graphic design, product design and 
architecture — towards the digital material are every 
bit as important.” Further, as McCullough (2004) 
notes, “the goal of natural interaction drives the 
movement toward pervasive computing and embed- 
ded systems” (p. 70). 

Techniques of narrative characteristic of inter- 
active hypertext are being exploited to increase user 
involvement in a variety of commercial and web 
applications (Broden, Gallagher, & Woytek, 2004). 

The digital design identity of corporations and 
brands offers another area for future application. 
McCullough (2004) has underlined the prospective 
value of interactive media for developing new rela- 
tionships between the brand and the market, and 
particularly emphasized the expected future diversi- 
fication of interactive systems by digital brands and 
services as a way of manifesting and performing 
brand identity. 



CONCLUSION 

As Aarseth (2003) suggests, “attempts to clarify 
what interactivity means should start by acknowledging
that the term’s meaning is constantly shifting
and probably without descriptive power and then try 
to argue why we need it, in spite of this” (p. 426). We 
need interactivity and all the various points of view 
that coexist within the shifting meaning of this term
because successful interaction transforms the pas- 
sive receiver of information into the active partici- 
pant in communication. 



KEY TERMS

Common Ground: Shared knowledge and ex- 
perience common to both sender and receiver. This 
“common ground” enables the references and con- 
text of the message to be deciphered successfully 
and meaning to be communicated. 

Digital Interactivity: Despite the fact that 
interactivity as a blanket concept cannot be precisely
defined, the quality of interactivity defined by
the user generally depends on the amount of “com- 
mon ground”, the user’s perceived ability to control 
and influence form and content of the mediated 
environment, to be “engaged” in mediated space (in 
terms of belief and/or in terms of sensory stimulation 
or displaced physical enactment or embodiment), 
and to participate in multidimensional feedback which 
offers choice in real time. 

Hypertext: Text (and we use the term here in
the broad sense to include “text” that may be verbal 
and/or visual) which is constructed as 
“polysequential” (Douglas, 2000) and multidimen- 
sional through a network of associational links. 

Interaction Design: “There is no commonly 
agreed definition of interaction design; most people 
in the field, however, would probably subscribe to a 
general orientation towards shaping software, Web 
sites, video games and other digital artefacts, with 
particular attention to the qualities of the experi- 
ences they provide to users” (Lowgren, 2002). 

Multimodal Discourse: Discourses are “so- 
cially situated forms of knowledge about (aspects 
of) reality. This includes knowledge of the events 
constituting that reality... as well as a set of related
evaluations, purposes, interpretations and legitima- 
tions.” Modes are “semiotic resources which allow 
the simultaneous realization of discourses and types 
of (inter)action... Modes can be realized in more 
than one production medium. Narrative is a mode 
because it allows discourses to be formulated in 
particular ways... because it constitutes a particular
kind of interaction, and because it can be realized in 
a range of different media” (Kress & Van Leeuwen,
2001, pp. 20-22).

Telepresence: Telepresence has been suc- 
cessfully achieved when the mediated environment 
is perceived by the user as having similar “presence” 
and importance as the physical environment (Kiousis, 
2002).

Vicarious Kinaesthesia: The dimension of di- 
rect physical involvement which gives the user in a 
mediated environment the impression of agency, of 
controlling events that are taking place in the present 
(Darley, 2000, p. 157). 



INTRODUCTION

Over the last three decades and, above all, during the 
last few years, advances in areas that have been 
crucial for the success of the now multi-billion-dollar 
computer and video game industry (in particular, 
those of graphics and gameplay complexity) have 
been nothing short of breathtaking. Present-day 
console games run on machines offering quite re- 
markable possibilities to game developers. Their 
stylish presentation and compelling interactivity con- 
tinue to set exceedingly high standards to which 
many serious applications running on desktop com- 
puters can only aspire. In spite of their adolescent 
image, games (particularly, console games) have 
continually raised general computer-user expecta- 
tions. 



BACKGROUND 

In August 2004, 128-bit consoles (Playstation2, Xbox,
Gamecube) were approaching the end of their product
lifecycles and were due to be replaced by 256-bit
systems. It is inevitable that games for the new
machines will offer even greater sophistication in
their user interfaces, especially with respect to
graphics. It is not surprising, then, that interest in this
area is intensifying, not only within the games
development community (as evidenced in dedicated
Web-based resources for game design, such as those at
Gamasutra, www.gamasutra.com) but also in academia,
where the increase in the number of universities
delivering game-design courses parallels the growing
quantity of research devoted to the topic. It is
also in the field of on-screen visual interface, as 
opposed to physical interface (hardware such as the 
now common joypad games controllers), that most 
progress has been made and on which most research 
currently is centered. 

Visual Interface 

Screen displays have improved beyond recognition 
since the dawn of commercially available computer 
games in the 1970s. Spacewar (Figure 1), released 
in 1962 for the PDP-1 mainframe computer, often is
referred to as the first graphics-based computer
game, but it was not until the advent of Atari’s Pong
in 1975 (Figure 2) that computer games entered the
home, and the real computer-games industry began.

Figure 1. Spacewar (1962 - PDP-1)

The visual interface of Spacewar nonetheless typified
that of the 1960s and 1970s in both its graphical
simplicity and the undemanding nature of the user 
control it offered; gamers had only four options — 
rotate left, rotate right, thrust, and shoot. Still less 
advanced, even given the 13-year age gap, was the 
interface of Pong — players merely moved a block 
of pixels up and down; the block was supposed to 
represent a table-tennis bat that sent a square “ball” 
to the other side of the screen at an angle determined 
by the position of the bat and the previous stroke. 
Jump forward in time to 1989, and there was some- 
thing of a transformation in the norm for the games 
interface. Super Mario World 3 (Figure 3) characterized
games of its period with its basic two-dimensional
platform-style visuals, but the interface
was, in fact, far more sophisticated than the games
of the 1970s and early 1980s; in addition to colour, it
offered dynamic on-screen textual information,
including options between stages and more complex
controls.

Figure 3. Super Mario World 3 (1989 - Nintendo Entertainment System)

The visual interfaces of the current generation
of games have taken on even greater complexity,
as exemplified by Mario Sunshine (Figure
4), where 3D rendering, a wide array of controls, and 
changes in visual perspectives (e.g., from first- to 
third-person and 360-degree camera angles) are the 
order of the day. Thus, whereas a quarter of a 
century ago, even inexperienced gamers were able 
to play a game to the maximum of what it had to offer 
with barely any learning involved, this is no longer 
the case with most of today’s games, given the 
degree of familiarity required for understanding and 
making full use of a typical game’s interface. This is
particularly true of the strategy, simulation, and role- 
play genres, where the emphasis on information 
management necessitates an intricate visual inter- 
face (see Figure 5 for one example). 

Figure 4. Mario Sunshine (2002 - Nintendo Gamecube)

Figure 5. Warcraft 3: Frozen Throne (2003 - PC)

Yet, despite the extent of the revolution in visual
interfaces, game designers still need to adhere to
certain conventions (Poole, 2000). While the task of
designing effective interfaces in most cases is linked
inevitably to the type of game being developed,
research has shown that the design of any game
generally considered to be good tends to conform to
a fixed pattern (Cousins, 2003; Fabricatore et al.,
2002; Ip & Jacobs, 2004). Accordingly, even with all
the opportunities offered by today’s hardware, there
is not as much freedom in design as might be
imagined. Studies by academic researchers such as
Taylor (2002) and Warren (2003) and those by hands- 
on games designers such as Dalmau (1999) and 
Caminos and Stellmach (2004) have shown convinc- 
ingly that good visual interface design is now a 
prerequisite for those attempting to develop a truly 
immersive gaming experience, but that using the full 
power of the hardware does not in itself lead to 
effective design. Johnson and Wiles (2001) have 
proposed that the most successful games are those 
whose user interfaces invoke a deep sense of con- 
centration, enjoyment, and absorption, whether or not 
they are innovative. Caminos and Stellmach (2004) 
discuss the issues surrounding the development of an 
intuitive user interface for games, noting that, despite 
what may appear to be simple design problems, 
getting the basic interface right is extraordinarily 
difficult. However, the full capabilities of the hard- 
ware cannot be ignored, of course. In addition to 
what one might call the conventional aspects of 
visual-interface design (basic screen layout, menu 
design, etc.), modern game designers must also take 
account of more complex possibilities, such as the 
point-of-view (camera angle) delivered to the gamer 
as smoothly as possible, or realistic graphical effects 
(Adams, 1999; Poole, 2000; Schell & Shochet, 2001). 
Taylor (2002) has examined the delicate relationship 
between first- and third-person perspectives and 
how these influence gameplay, while Federoff (2002)
has proposed various methods for the evaluation of a 
game’s playability based on its visual appearance. 

Physical Interface 

Research on game user interfaces so far has con- 
verged above all on visuals. This is somewhat sur- 
prising, since while for the majority of non-game 
computer applications the hardware interface con- 
sists of nothing more than a keyboard and a mouse, 
the present generation of computer and video games 
can benefit significantly from a broader range of 
peripherals. In spite of this, hardware games inter- 
faces have changed far less over the years than their 
on-screen counterpart. 

The most common devices for domestic gaming, 
as defined by the standard mode of interaction with 
the game, are joypads, joysticks, and keyboards, 
which typically are bundled with the initial purchase 
of the hardware. Many other optional devices have
also been used over the past two decades, and items
such as steering wheels and light guns have been 
fairly popular for use with appropriate games. Re- 
cently, dance mats and pressure pads, which enable 
users to interact with their entire bodies, have also 
been gaining some ground. Peripherals designed for 
the arcade market include innovative hardware 
interfaces, such as skiing platforms and driving or 
flying simulators, but the high cost of producing 
these as well as the bulkiness of the cabinets they 
use have more or less confined them to their ar- 
cades; there has been little or no transference to the 
domestic market. 

Nevertheless, there have been numerous at- 
tempts to introduce elaborate peripherals into the 
home market, some of which have been spectacular 
failures. The promise of virtual-reality (VR) hard- 
ware, head-mounted displays above all, which in the 
1990s showed all the signs of offering the ultimate 
gaming experience, has remained largely unful- 
filled. Head-mounted displays fell short of expecta- 
tions in a number of ways, including comparatively 
poor graphic resolution and their tendency to cause 
so-called VR sickness, the equivalent of motion 
sickness (Adams, 1998; Edge, 1999; Vaughan, 1999), 
not to mention the fact that players generally do not 
want to wear cumbersome equipment (Hecker, 
1999). Thus, VR has not yet been able to establish 
a significant foothold in the game industry, whereas 
over a relatively long period of time, there have 
been numerous successful applications of it in other 
fields (Kalawsky, 1993). Two other examples of 
advanced physical interfaces that have not achieved 
widespread use are Mattel’s Powerglove for the 
Nintendo Entertainment System (NES) and 
Nintendo’s Virtual Boy. The Powerglove, intro- 
duced in 1989, was a potentially winning VR data 
glove that could track hand motion in three dimen- 
sions; it was soon withdrawn because of serious 
technical problems (Gardner, 1989). Virtual Boy 
was a stereoscopic device that claimed on its re- 
lease in 1995 to be ushering in a new era of video 
games. It failed badly as a result of a rather clumsy 
design and relatively poor interaction possibilities 
(Herman, 1997). 

Conventional joypads, keyboards, and mice, then, 
persist as the dominant physical interfaces. As can 
be seen in Table 1 (a comparison among the most 
popular platforms and their hardware interfaces 
since the 1970s), in spite of the availability of a wide
selection of peripherals, the principal physical inter- 
face for console games has remained the joypad and, 
for computer games, the keyboard and mouse. In- 
deed, such is the emphasis on these interfaces, that 
even titles that would appear to be ideally suited to 
genre-specific devices (e.g., driving and shooting
games) are designed primarily with joypads and 
keyboards in mind, and comparatively common ad- 
ditional interfaces such as steering wheels and light 
guns are still seen largely as novelty items. Evident 
in Table 1 also is that the number of buttons on 
joypads has increased steadily with each new generation
since the 1970s (except between generations
4 and 5; in the case of the Nintendo Gamecube,
the number of buttons actually decreased from the
previous N64 machine). Yet the increase in the
number of buttons aside, it is clear that the predomi- 
nant physical interface (the joypad for console games, 
the keyboard and mouse for the desktop or laptop 
computer) has actually altered little since the 1970s.

Table 1. Most popular physical interfaces of game platforms

Generation/decade | Platform | Standard interface | Common optional interfaces
1/1970s | Magnavox Odyssey | Analogue dial controller | /
1/1970s | Atari VCS/2600 | Single-button joystick | /
2/1980s | Nintendo NES | Two-button joypad | Joystick, infra-red gun
2/1980s | Sega Master System | Two-button joypad | Joystick, infra-red gun
2/1980s | NEC PC-Engine | Two-button joypad | /
3/1980s | Sega Megadrive | Three-button joypad | Joystick, six-button joypad, infra-red gun
3/1990s | Nintendo Super NES | Six-button joypad | Joystick, infra-red gun
3/1990s | SNK Neo Geo | Four-button joystick | Joypad
4/1990s | Playstation 1 | 8-button joypad | Infra-red gun, steering wheel, dance mat
4/1990s | Sega Saturn | 8-button joypad | Infra-red gun, steering wheel
4/1990s | Nintendo N64 | 9-button joypad | /
5/late 1990s to present | Playstation2 | 8-button joypad with additional analogue sticks and built-in rumble | Infra-red gun, steering wheel, dance mat, EyeToy
5/late 1990s to present | Xbox | 8-button joypad with additional analogue sticks and built-in rumble | Infra-red gun, steering wheel, dance mat
5/late 1990s to present | Gamecube | 7-button joypad with additional analogue sticks and built-in rumble | Steering wheel
1970s to present | PC | Keyboard, mouse | Joypad, joystick, steering wheels/flight simulation controllers, speech recognition



FUTURE TRENDS

The fact that over the years there has been little
change in standard physical game interfaces is now
being taken seriously by manufacturers, because it
may be only a matter of time before joypads,
keyboards, and mice simply will no longer fit the bill.
Hence, we are seeing developments such as Sony’s
EyeToy (in which a camera attached to Playstation
2 allows the player to be immersed in a game by
becoming a character on the screen) and hands-free 
gaming devices based on relatively simple Web 
cams that can be used as a substitute for the mouse 
(Gorodnichy & Roth, 2004). Further, despite the 
obstacles presented by VR, research is on the 
increase into the possibilities for domestic use of- 
fered by immersive VR-type full-body interaction 
(Warren, 2003). One of the most intriguing among 
the emerging ideas is an affective interface that has 
been described as “computing that relates to, arises 
from, or deliberately influences emotions” (Picard, 
1997, p. 3). A study conducted by Scheirer et al.
(2002) has demonstrated how user emotions can be 
measured and taken into account in order to facili- 
tate the design of the user interface and, even more 
importantly, to enable certain factors (e.g., screen 
layout or number of button presses before the next 
option is presented) to respond by making changes in 
real time that depend on the behavior of the user 
(e.g., when frustration is detected). It may be that 
such developments will be necessary in order for the 
videogame industry to continue to flourish, since, 
even with the advent of 256-bit machines capable of 
photo-realistic 3D graphics and fully controllable, 
interactive, seamless motion video, improvements in 
the visual interface alone may prove insufficient to 
ward off the player dissatisfaction, which, since the 
early 1990s and in the face of all the success, has 
been a central factor in holding back what might 
have been an even greater market volatility (Edge, 
1993, 2002, 2003, 2004). Indeed, it may be that the 
very existence of ever-more advanced visual inter- 
faces, if unaccompanied by parallel developments in 
physical interfaces, may cause such a disparity 
between the two that design creativity will suffer, 
while the market witnesses even greater consumer 
resistance than has been the case so far. 
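
The kind of real-time adaptation described above can be sketched in a few lines. This is a minimal illustration only, assuming that a frustration score between 0 and 1 has already been estimated elsewhere (for example, from physiological or behavioural signals). The function name, the threshold value, and the rule of halving the number of button presses are hypothetical choices made for this sketch; they are not the method of Scheirer et al. (2002).

```python
def presses_before_next_option(base_presses, frustration, threshold=0.7):
    """Return how many button presses to require before the next option.

    base_presses: presses required in the default layout.
    frustration:  estimated frustration score in [0, 1] (assumed input).
    threshold:    hypothetical level above which the interface simplifies.
    """
    if not 0.0 <= frustration <= 1.0:
        raise ValueError("frustration must lie in [0, 1]")
    if frustration >= threshold:
        # Halve the path (rounding up), but always require at least one press.
        return max(1, (base_presses + 1) // 2)
    return base_presses
```

A game loop might call such a function each time a menu is opened, so that a player showing signs of frustration reaches the next option sooner, while a calm player sees the default layout.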



CONCLUSION 

We have seen that the visual interface for games has 
come a long way in terms of both capabilities and 
design complexity, while the physical interface has 
not kept pace. Of course, it may be that even if they 
lack realism, the joypad controllers of Gamecube, 
Playstation, and Xbox are the most natural command
mechanisms for games requiring users to move
objects or characters around a virtual space and to
make them perform actions. In driving games, a 
steering wheel and pedals; in first-person shooters, 
a gun; in golf games, a golf club; and so forth, would 
seem to be obvious standard replacements for cur- 
rent controllers, but in practice, joypads have proven 
so far to retain a solidly entrenched position as the 
dominant vehicle of interaction between player and 
game. This is partly because many other peripherals 
have not reached a stage at which they can be used 
with the ease and accuracy of a joypad (drivers of 
real cars find, for example, that even in the most 
expensive games, steering wheels do not behave like 
real steering wheels, even when properly calibrated). 
In any event, while the joypad can be used across 
many different types of game, specialized peripherals
are self-evidently not suitable for all types. In fact,
the joypad has become so much a part of the 
videogame culture that it well may persist in more or 
less its present form, regardless of future visual- 
interface developments, while the keyboard and the 
mouse, however clumsy they may be when used for 
PC games, so far have shown considerable resis- 
tance even against the joystick, notwithstanding the 
low cost of the latter and specific joystick ports on 
many desktop computers. 

On the other hand, the recent spate of releases of 
compact and relatively inexpensive interactive control
devices (e.g., the motion-sensitive EyeToy; headphones
and microphones offering voice commands
and audio feedback; pressure pads for skateboard- 
ing and snowboarding games; increasingly accurate 
light guns with long reach; and infrared pads for 
football games, which recognize shooting, tackles, 
and so on) still may see substantial growth as they 
break free of the confines of arcades. Needless to 
say, the obstacle to mass sales, which is the speci- 
ficity of such devices, always will remain, but as 
prices fall and as visual interfaces make greater 
demands on their corresponding controllers, so the 
all-purpose joypad eventually may become more of 
a secondary device than it is at present. 

If this does not prove to be the case, one wonders 
how joypads, keyboards, and mice will be able to 
cope with the coming advances in visual interfaces 
and to what extent we shall see consumer resistance 
as a result. That said, who could have predicted a 
few years ago that Nintendo’s Gameboy, with its 
diminutive screen and limited control functions (let 
alone mobile phones with their even smaller screens
and more primitive control functions) would have 
been so well received by so many consumers as 
hand-held games devices? 



KEY TERMS

Computer Game: An interactive game played 
on a computer. 

First-Person Perspective: The visualization 
of the gaming environment through the eyes of the 
character. 

Game(s) User Interface: Elements and devices
through which the user interacts with the
game. 

Joypad: A palm-sized device designed for use 
with both hands to interact with the game. Its layout 
is typified by directional keys on the left and buttons 
on the right and top sections of the pad. Modern pads 
incorporate additional analogue sticks on the left or 
on both the left and right sides. 

Joystick: A 360-degree stick mounted on a 
sturdy platform of buttons used for interacting with 
the game; used predominantly in stand-alone arcade 
machines and early home consoles. 

Light Gun: A device used for shooting games, 
which allows the user to target objects on screen;
used predominantly in stand-alone arcade machines
and some home consoles. 

Physical Interface: Tangible devices for inter- 
action with the game. 

Third-Person Perspective: The visualization 
of the gaming environment from a viewpoint outside
the body of the character.

Videogame: An interactive game played on a 
stand-alone arcade machine or home console. 

Visual Interface: Visual on-screen elements 
that can be altered or that provide information to the 
user during interaction with the game. 



INTRODUCTION

European users have eagerly adopted novel forms of 
digital media and related information and communications
technologies (Stanton, 2001), making them a
part of their increasingly varied and segmented 
cultures (Brown, Green, & Harper, 2001). For ex- 
ample, the young are active consumers of music, 
videos, movies, and games; businessmen on the 
other hand need more and more working tools and 
applications that enable connectivity when they are 
on the move. A similar scenario is envisaged for
troops in action, where work on tactical
and strategic information and mission management,
command, and control, including real-time mission
replanning, is essential. All these users rely on the
Internet, i-TV, and mobile phones, and they have 
adapted all of these into the fabric of their lifestyles, 
or in short, their mobile life. But, functionality cannot 
be the main driver for design as mobile life is also 
deeply founded upon shared values and worldviews 
of the users, pleasure, enjoyment, culture, safety, 
trust, desire, and so forth (Rheingold, 1993). 

For example, WAP (wireless application proto- 
col) technologies seemed to provide a powerful tool 
to the mobile worker. However, the failure of WAP
is well known; it was due mainly to poor usability,
high usage cost, and an inadequate range of services,
together with intrinsic limitations of
the device itself (insufficient memory storage, low
battery autonomy, poor screen resolution, etc.;
Cereijo Roibas, 2001). However, some WAP applications
have been widely used by Italian users.
The success of this system of applications is due to 
its efficiency, effectiveness, and relevance for some 
specific work purposes. Each of the services will be 
analysed, describing the expected use of each ser- 
vice and the actual use of it by Italian users. 



BACKGROUND 

Many efforts have been devoted to designing valuable
tools for the mobile worker, but so far only a few of 
them have been successful. Surprisingly, most of the 
mobile applications originally designed as work tools 
(chat, message board, etc.) have found a fertile 
market in entertainment. There seem to be two main 
causes of this failure: the lack of usability of the 
applications provided and, above all, the failure to 
create realistic usage scenarios. 

The European Commission (EC) and European 
Space Agency (ESA) jointly set up an expert group 
on collaborative working environments that met for 
the first time in Brussels on May 4, 2004. The expert 
group discussed the vision of next-generation col- 
laborative working environments (NGCWEs). The 
vision drawn by the expert group was that NGCWEs 
will deliver a high quality of experience to cowork- 
ers, and will be based on flexible service components 
and customized to different communities. Mobility, 
interaction among peers (systems and persons), 
utility-like computing capacity and connectivity, 
contextualization and content, security, privacy, and 
trust were among the RTD challenges in nine areas 
identified by the experts. 

FROM THE MOBILE 
ENTERTAINMENT COMMUNITY TO 
THE NOMADIC WORK TEAM 

If there is always an uncertainty about the most 
suitable use of a new service, the case of wireless 
applications is not an exception (Flynn, 2002). As 
will be explained for each case, the following appli- 
cations have been created to work in different 
platforms: WAP, WEB, and SMS (short-message 
service). They were supposed to find wide use
within the increasing Italian mobile community for 
entertainment purposes. However, they have been 
used more and more as work tools. Obviously, some 
services such as the multimode chat have had and 
still have a strong use for entertainment purposes 
(“Now There is the ‘Wappario,’” 2000). Users that
need to communicate with their colleagues in real 
time when they are out of the office are largely using 
mobile chat as a working tool. There is no experi- 
ence of the above-mentioned phenomenon in the 
classical Web-only chat services. 

THE USE OF WAP AND SMS 
APPLICATIONS AS 
WORKING TOOLS 

Who, Where, When, Why, and How 

Supposed target users of multimode applications 
(Burkhardt et al., 2002) were thought to be teenag- 
ers, but, as recent technology history has shown, 
consumers’ behaviour and use of technology have 
contradicted predictions. Mobile technologies, rang- 
ing from WAP to SMS, from GPRS (general packet 
radio service) to MMS (multimedia message ser- 
vice), were born to meet the desire of teens, the 
same people who had made text messaging their 
preferred medium. However, as had happened for 
short messages, multimode applications have been 
used for a purpose that contradicts their unique selling
proposition, confirming once more the inner limits of 
today’s marketing of new technology. Outside cu- 
bicles, mobility is at the heart of multimode applica- 
tions, allowing users to make a real personal use of 
technology. People stopped being slaves of given 
and prepackaged software: They want technology 
when it proves relevant in real life, not a
utopian Internet where they are living nowhere but 
in wires because without wires, people are free. 
Relevance is the reason for multimode applications:
People want technology relevant to work, dating, 
participating in TV voting, and chatting. What counts 
is that technology is at their hands when they need it, 
and they are the ones giving meaning to it. It is not 
a chat software waiting for them at a URL (uniform 
resource locator), but it is a person at home or in the
office with a need and a device that can help satisfy
it. How these technologies are used is a matter of 
context: There is no optimized path to be imposed on 
end users. Multimode applications impose new chal- 
lenges because end users get more and more de- 
manding. End users do not want technology, but 
services. Technology is going back behind the scenes. 

Mobile Applications for Work and 
Sharing Knowledge 

The services that will be discussed below were 
launched by the Italian mobile-services provider 
HiuGO in early 2001 as part of a Blu-branded 
offering. All applications fully exploited the potential
of the mobile medium by combining messaging
with WAP and later on GPRS browsing. The appli- 
cations have been tested and corrected according to 
the company’s usability standards both before their 
launch and after it as data from consumers were 
collected. Continuous interaction with mobile users 
included the following activities: monitoring usage 
and traffic patterns, controlling services require- 
ments, polling end users’ expectations, analysing 
end users’ interactions (Schneiderman, 1987), and 
checking users’ satisfaction. 

The evaluation methodology was based mainly on the
following measures:

• Users’ perceived value of the services provided
(questionnaires)

• Self-training with a quick learning phase
(usability test)

• Time for task completion (usability test)

• Number of users’ irreversible mistakes
(usability test)

• Satisfaction of users’ expectations
(questionnaires)

• Level of users’ interaction (usability test)

• Flexibility toward users’ personalization
(usability test and questionnaires)
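
As a rough illustration of how the usability-test measures listed above could be summarized, the following Python sketch aggregates a handful of test sessions. The Session record, its field names, and the sample values are assumptions made for this example; they are not data or tooling from the study.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Session:
    """One usability-test session (hypothetical record layout)."""
    user: str
    task_seconds: float        # time for task completion
    irreversible_errors: int   # mistakes the user could not undo
    completed: bool            # whether the task was finished


def summarize(sessions: List[Session]) -> Dict[str, float]:
    """Aggregate three of the measures named in the list above."""
    done = [s for s in sessions if s.completed]
    return {
        "completion_rate": len(done) / len(sessions),
        # Task time is averaged over completed sessions only.
        "mean_task_seconds": mean(s.task_seconds for s in done),
        "errors_per_session": sum(s.irreversible_errors for s in sessions) / len(sessions),
    }


# Illustrative values only.
sessions = [
    Session("u1", 40.0, 0, True),
    Session("u2", 60.0, 1, True),
    Session("u3", 90.0, 2, False),
    Session("u4", 50.0, 1, True),
]
summary = summarize(sessions)
print(summary)
```

Questionnaire-based measures (perceived value, satisfaction of expectations) would need a separate scoring scheme, which the text does not describe.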

Message Board 

Figure 1. Message board on an iPAQ display

Figure 2. Multimode chat on a SmartPhone display

A message board is a thematic forum with file-sharing
options. Users can access the forum via SMS or
WAP/GPRS. All functions are available and optimized
for mobile devices. Users subscribe to one or more
forums by choosing a nickname and password. Once
successfully registered, they can browse content by
topic, publish messages, and receive alerts when a
topic they are interested in is updated, simply by
setting a keyword alert. For example, by sending a text
message with the following keyword and search
parameters, <update food supplies>, the user will get
an alert every time a matching message is posted. The
mobile device turns out to be essential for getting
critical updates. This tool has also proved ideal for
knowledge sharing among working groups (Figure 1). In
distance learning courses and for nomadic groups of
workers and journalists, message boards turn out to be
a vital resource. Journalists who need to move from one
place to another can keep in touch with each other and
with the editorial team thanks to closed message
boards, where they can consult last-minute updates,
information, and rumours, or send short previews of
their articles in real time as events are happening.
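
The keyword alert described above can be sketched in a few lines of Python. This is only an illustration: the text shows a single example request, <update food supplies>, so the exact syntax accepted here, the rule that all keywords must occur in a message, and the names parse_alert and AlertRegistry are assumptions rather than the service's actual design.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional

# Assumed syntax "<update kw1 kw2 ...>", modeled on the one example in the text.
ALERT_PATTERN = re.compile(r"^\s*<\s*update\s+([^<>]+?)\s*>\s*$", re.IGNORECASE)


def parse_alert(sms_text: str) -> Optional[FrozenSet[str]]:
    """Return the lower-cased keywords of an alert request, or None if malformed."""
    match = ALERT_PATTERN.match(sms_text)
    if match is None:
        return None
    keywords = frozenset(match.group(1).lower().split())
    return keywords or None


@dataclass
class AlertRegistry:
    """Maps subscriber nicknames to the keyword sets they asked to follow."""
    subscriptions: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def subscribe(self, nickname: str, sms_text: str) -> bool:
        keywords = parse_alert(sms_text)
        if keywords is None:
            return False
        self.subscriptions[nickname] = keywords
        return True

    def recipients(self, message: str) -> List[str]:
        """Nicknames whose keywords all occur in a newly posted message."""
        words = set(re.findall(r"\w+", message.lower()))
        return sorted(n for n, kws in self.subscriptions.items() if kws <= words)


registry = AlertRegistry()
registry.subscribe("anna", "<update food supplies>")
registry.subscribe("marco", "<update press conference>")
notified = registry.recipients("New food and water supplies arrived at the camp")
print(notified)
```

In a deployed service, each nickname returned by recipients would trigger an outgoing SMS; the delivery step is left out because the text gives no details about it.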

Multimode Chat 

Multimode chat is a real-time communication service
that works via SMS (asynchronous), the Internet, and
WAP/GPRS (synchronous): The same interface is
accessible from a PC (personal computer) and from a
mobile phone (each chatter has an icon or avatar that
indicates the device he or she is using at that
moment). The chat service is organised in thematic
rooms and permits one-to-many and one-to-one messages.
Users can also create buddy lists and be alerted via
SMS when one of their buddies is available (Figure 2).
Multimode chat has proved essential in working
communities that need to keep in touch on the fly, for
example, among study groups and in distance learning
courses. When public chats are held with a teacher,
students who are part of a buddy group get reminders
of appointments. Those who cannot access the Net via
PCs can participate in the discussion via mobile
devices. Another example of usage is given by
journalists, who log in to the chat via mobile phones
and can send information to their teams and colleagues
in real time while interviewing someone or during
press conferences. Mobile chats are also useful for
security guards, who can keep in touch with each other
through SMS or WAP/GPRS: Because they do not need to
talk, they cannot be overheard, which lets them share
sensitive information in a simple way.

PageMaker 

PageMaker is a publishing tool that makes the creation
of personal WAP pages easy and immediate. Users can
publish their own content (text and images, and now
also colour images on newer devices) and protect it
with a password. They can also use advanced
interactive features such as personal chats and
message boards accessible via the Internet or mobile
devices. PageMaker has been adopted by all kinds of
professionals wishing to promote their activities:
Lawyers promote their practices, as do shopkeepers
their shops. Users searching for information can get
localised results thanks to the positioning systems
that have been implemented. Information sent to users
can be enriched with interactive maps that help them
reach places. Marketing managers have exploited the
potential of the medium: Coupons and special offers
can now reach their targets in an unprecedented way,
on a personal device and with relevant information,
such as a special offer responding to a specific need.

Multimode Mail 

Multimode mail uses personal mailboxes accessible
via PC and mobile devices. SMS alerts notify users of
new mail. All mail functions are available from the
WAP interface. This service proves to be very useful
for working communities of every kind.

Event Enhancer 

Event Enhancer is a complete suite of multimode
software that assists attendees and exhibitors during
events. Users who subscribe to the service can receive
information on locations and alerts on special events
of interest to them. A dedicated matchmaking engine
also saves them time and effort in finding the right
person at the right time: After entering their profile
and needs, users are put in contact with the person or
company they need to meet. Users who have been matched
can also chat via SMS or WAP before meeting, exchange
business cards, and download commercial information on
Bluetooth-enabled handsets. Users can also book events
of interest via SMS or WAP. The application has been
adopted by schools: Courses have their own schedules
available via mobile devices, teachers who give their
contact information can be reached at any moment, and
students can enroll in classes, seminars, or special
courses at the last minute (in the Italian school
system). Event Enhancer has also turned out to be a
very useful and successful application for companies.
It has been adopted as a marketing tool to optimize
ROI on fairs and events: Procter & Gamble first
adopted it at the international beauty fair in 2001,
setting an example for others. Event Enhancer helped
drive traffic to the stand and offered personalized
service: Attendees were given the chance to set an
appointment and get an SMS reminder, receive
personalized advice in their mobile mail or via SMS,
get a mobile coupon, and participate in an instant-win
competition. Many other applications could be
mentioned. Sometimes a simple SMS can improve
productivity or facilitate work. Referees are an
example: When a match ends, each referee has to send
the official score to the national federation.
Referees once had to use faxes, but now a simple text
message to a dedicated service number is all that is
required.
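
The matchmaking engine mentioned for Event Enhancer is not specified in the text. One simple way such matching could work is to rank other participants by how many of a user's stated needs they can cover. The Python sketch below makes that assumption; the Attendee record, the scoring rule, and all sample names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Attendee:
    """Profile entered by an attendee or exhibitor (hypothetical layout)."""
    name: str
    offers: FrozenSet[str]   # what this participant can provide
    needs: FrozenSet[str]    # what this participant is looking for


def rank_matches(seeker: Attendee, others: List[Attendee]) -> List[Tuple[str, int]]:
    """Rank the other participants by how many of the seeker's needs they cover."""
    scored = [
        (other.name, len(seeker.needs & other.offers))
        for other in others
        if other.name != seeker.name
    ]
    matches = [(name, score) for name, score in scored if score > 0]
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


buyer = Attendee("buyer", frozenset(), frozenset({"packaging", "logistics"}))
exhibitors = [
    Attendee("acme", frozenset({"packaging", "logistics"}), frozenset()),
    Attendee("boxco", frozenset({"packaging"}), frozenset()),
    Attendee("cafe", frozenset({"catering"}), frozenset()),
]
ranking = rank_matches(buyer, exhibitors)
print(ranking)
```

A production engine would presumably also weigh availability and location at the event; set overlap is used here only because it is the simplest rule that fits the description.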

Transactions and Error Rates 

The following statistics describe usage of the
services as of April 2001.

• Message Board: There are 150,000 active
users, and 78% of them access the boards from
mobile devices via WAP or SMS. The average
number of transactions via SMS is 18 per
month. The error rate is 7%. There are 20,000
new monthly users, and the churn rate is 7%.

• Multimode Chat: There are 100,000 active 
registered users, with 90% accessing from 
mobile devices via WAP or SMS. The average 
number of monthly transactions via SMS is 20. 
There is an error rate of 5.5%. There are 
20,000 new monthly users, and there is a churn 
rate of 15%. 

• Multimode Mail: There are 250,000 active
users of multimode mail. The average number
of transactions via SMS is 24 per month. The
error rate is 3%. The number of new monthly
users is 22,000. The churn rate is 3%.

• PageMaker: PageMaker has 70,000 active 
users, with 60% accessing from mobile devices 
via WAP or SMS. The average number of 
monthly transactions via SMS is 10. The error 
rate is 7%. The number of new monthly users 
is 6,000, and the churn rate is 7%. 

• Event Enhancer: There are 120,000 active 
users, and 78% of them access the service 
from mobile devices via WAP or SMS. The 
average number of transactions via SMS is 10 
per month. There is an error rate of 4%. The 
number of new monthly users is 4,000. The 
churn rate is 11%. 
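
To make the figures above easier to compare, the following Python sketch derives two quantities from them: the number of users accessing each service from mobile devices, and the net monthly change in users. The second calculation rests on an assumption the text does not confirm, namely that the churn rate is monthly and applies to the active user base.

```python
from typing import Optional

# Figures as reported above; None marks a share the text does not give.
SERVICES = {
    # name: (active users, % mobile access, new users per month, churn %)
    "Message Board": (150_000, 78, 20_000, 7),
    "Multimode Chat": (100_000, 90, 20_000, 15),
    "Multimode Mail": (250_000, None, 22_000, 3),
    "PageMaker": (70_000, 60, 6_000, 7),
    "Event Enhancer": (120_000, 78, 4_000, 11),
}


def mobile_users(active: int, mobile_pct: Optional[int]) -> Optional[int]:
    """Users accessing via WAP or SMS, or None when the share is not reported."""
    return None if mobile_pct is None else active * mobile_pct // 100


def net_monthly_change(active: int, new_users: int, churn_pct: int) -> int:
    """New users minus churned users (assumes monthly churn on the active base)."""
    return new_users - active * churn_pct // 100


for name, (active, mobile_pct, new_users, churn_pct) in SERVICES.items():
    print(name, mobile_users(active, mobile_pct),
          net_monthly_change(active, new_users, churn_pct))
```

Under that assumption, Event Enhancer would be the only service losing users on balance (4,000 new users against 13,200 churned), whereas Multimode Mail would show the largest net gain (22,000 against 7,500).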



FUTURE TRENDS 

Small personal interfaces, such as those on mobile
phones, interconnected with other surrounding
platforms (e.g., interactive TV, PCs, PDAs [personal
digital assistants], in-car navigators, and
smart-house appliances) and particularly suitable for
context-awareness applications (Schilit, Adams, &
Want, 1994), will strongly stimulate the development
and diffusion of the envisaged ubiquitous
communication scenarios. These new scenarios will
require new kinds of services and applications and,
of course, new forms of content. A fertile research
area in this sense is the design of applications and
services for the mobile worker (Winslow & Bramer,
1994).

Designing complex ubiquitous communication 
scenarios for work involving cross-platform cus- 
tomer technologies (ranging from I-TV, radio, mu- 
sic, and mobile phones to portable or wearable 
information devices) for different users and con- 
texts requires an original way of conceiving the 
interactive user experience. This need to design 
novel ubiquitous and mobile services and products 
that will address the new demands, requirements, 
and potentials of mobile workers in critical situations 
implies a new approach to design that goes beyond 
the existing conventions. This design for innovation 
will lead to the identification of novel experience 
models and their social, cultural, and regulatory 
implications, allowing us to explore new and relevant
interactive forms and paradigms.

Potential challenges will be the creation of en- 
hanced network-enabled capability in distributed 
intelligent systems through superior context aware- 
ness (Tamminen, Oulasvirta, & Toiskallio, 2004), 
collaborative planning and replanning and coherency 
in asynchronous joint and collaborative work (Luff 
& Heath, 1998), and the improvement of human 
communication-systems effectiveness in general. 
These integrated systems should contribute to im- 
proving information operator uptake under stress, 
augmenting cognition and decision making, achiev- 
ing information and knowledge advantage, increas- 
ing decision agility and decision efficacy, and main- 
taining good strategic and supervisory control of 
mixed distributed autonomous and manned assets 
(Suchman, 1995). 

In this sense, we can anticipate the use of
intelligent agents (acting as information brokers)
embedded in ubiquitous systems (Maes, 1991) and aimed
at improving the effectiveness and accessibility of
human interaction with context-awareness technologies.
These agents need the following capabilities:
learning, organising, carrying out routine tasks, and
taking autonomous decisions in support of users when
unexpected events occur.
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CONCLUSION 

As the Italian case shows, it was not necessary to
wait for the arrival of the more promising 3G
(third-generation) technologies to have successful
mobile interactive services that enhance
communications in work environments (Cereijo Roibas et
al., 2002). This case demonstrates that it is possible
to design useful and usable services on an initially
poor technology despite its HCI (human-computer
interaction) limitations (Cereijo Roibas et al.). This
experience shows that extensive attention to the
mobile user is essential in order to envision
realistic and relevant scenarios of use (Kleinrock,
1996).
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KEY TERMS 

Decision-Support Systems: Software designed 
to facilitate decision making, particularly group deci- 
sion making. 

Ethnography: An approach to research that 
involves in-depth study through observation, inter- 
views, and artefact analysis in an attempt to gain a 
thorough understanding from many perspectives. 

GPRS (General Packet Radio Service): A standard
for wireless communications that runs at speeds up to
115 kilobits per second, compared to current GSM
(global system for mobile communications) systems’
9.6 kilobits. GPRS supports a wide range of
bandwidths, makes efficient use of limited bandwidth,
and is particularly suited for sending and receiving
small bursts of data, such as e-mail and Web browsing,
as well as large volumes of data.

MMS (Multimedia Message Service): A
store-and-forward method of transmitting graphics,
video clips, sound files, and short text messages over
wireless networks using the WAP protocol. Carriers
deploy special servers, dubbed MMS centers (MMSCs), to
implement the offerings on their systems. MMS also
supports e-mail addressing so the device can send
e-mails directly to an e-mail address. The most common
use of MMS is for communication between mobile phones.
MMS, however, is not the same as e-mail. MMS is based
on the concept of multimedia messaging. The
presentation of the message is coded into the
presentation file so that the images, sounds, and text
are displayed in a predetermined order as one singular
message. MMS does not support attachments as e-mail
does.

Multimode: Service that can be accessed and
used with different interfaces in a multiplatform
system (e.g., a chat that is available across handhelds
and PCs).

Shared-Window System: System that allows a
single-user application to be shared among multiple
users without modifying the original application.
Such a system shows identical views of the application
to the users and combines the input from the users or
allows only one user to input at a time.

SMS (Short-Message Service): A text-mes- 
sage service offered by the GSM digital cellular- 
telephone system. Using SMS, a short alphanumeric 
message can be sent to a mobile phone to be 
displayed there, much like in an alphanumeric pager 
system. The message is buffered by the GSM net- 
work until the phone becomes active. Messages 
must be no longer than 160 alphanumeric characters 
and contain no images or graphics. 

Usability Lab: A lab designed for user testing,
typically a quiet room with computer equipment and
a space for an observer to sit, along with a special
observation area.

User Studies: Any of the wide variety of methods
for understanding the usability of a system based on
examining actual users or other people who are
representative of the target user population.

User Testing: A family of methods for evaluating
a user interface by collecting data from people
actually using the system.



WAP (Wireless Application Protocol): A pro- 
tocol used with small handheld devices and small file 
sizes. 
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INTRODUCTION 

Information and Communication Technologies, 
known as ICT, have undergone dramatic changes in 
the last 25 years. The 1980s was the decade of the 
Personal Computer (PC), which brought computing 
into the home and, in an educational setting, into the 
classroom. The 1990s gave us the World Wide Web 
(the Web), building on the infrastructure of the 
Internet, which has revolutionized the availability 
and delivery of information. In the midst of this 
information revolution, we are now confronted with 
a third wave of novel technologies (i.e., mobile and 
wearable computing), where computing devices al- 
ready are becoming small enough so that we can 
carry them around at all times, and, in addition, they 
have the ability to interact with devices embedded in 
the environment. 

The development of wearable technology is per- 
haps a logical product of the convergence between 
the miniaturization of microchips (nanotechnology) 
and an increasing interest in pervasive computing, 
where mobility is the main objective. The miniatur- 
ization of computers is largely due to the decreasing 
size of semiconductors and switches; molecular 
manufacturing will allow for “not only molecular- 
scale switches but also nanoscale motors, pumps, 
pipes, machinery that could mimic skin” (Page, 
2003, p. 2). This shift in the size of computers has
obvious implications for human-computer interaction,
introducing the next generation of interfaces.
Neil Gershenfeld, the director of the Media Lab's 
Physics and Media Group, argues, “The world is 
becoming the interface. Computers as distinguish- 
able devices will disappear as the objects them- 
selves become the means we use to interact with 
both the physical and the virtual worlds” (Page, 
2003, p. 3). Ultimately, this will lead to a move away
from desktop user interfaces and toward mobile
interfaces and pervasive computing.

BACKGROUND 

Mobile computing supports the paradigm of
anytime-anywhere access (Perry et al., 2001), meaning
that users have continuous access to computing and
Web resources at all times and wherever they may be.
Used in a wide range of contexts, mobile computing
allows:

1. The extension of mobile communications and
data access beyond a desktop and static
location.

2. Access to electronic resources in situations 
when a desktop/laptop is not available. 

3. Communication with a community of users 
beyond the spatio/temporal boundaries of the 
work or home location. 

4. The ability to do field work; for example, data 
collection, experience recording, and 
notetaking. 

5. Location sensing facilities and access to ad- 
ministrative information. 

Mobile devices have several limitations due to 
their small size (form factor) that need to be consid- 
ered when developing applications: 

1. Small Screen Size: This can be very limited, 
for example, on mobile phones. Solutions to this 
problem necessitate innovative human-com- 
puter interaction design. 

2. Limited Performance: Processor capability,
available memory, storage space, and battery
life are all constrained. These are continuously
being improved, but users’ expectations are
growing as well.

3. Slow Connectivity: Anywhere Internet
connectivity is relatively slow at the moment;
3G technologies promise to improve the
situation. Wireless LAN connectivity, such as
802.11, provides simple and reliable
performance for localized communication.

Mobile devices generally support multimodal
interfaces, which ease usability within the
anytime-anywhere paradigm of computing. Such support
should include:

• Pen input and handwriting recognition soft- 
ware. 

• Voice input and speech recognition software. 

• Touch screen, supporting color, graphics, and 
audio where necessary. 

In order to take advantage of the promise of 
mobile computing devices, they need to have oper- 
ating systems support such as: 

• A version of Microsoft Windows for mobile 
devices. 

• Linux for mobile devices. 

• Palm for PDAs. 

• Symbian for mobile phones. 

In addition, mobile devices need to support appli- 
cations-development technologies such as: 

• Wireless Application Protocol (WAP); in the
current version, content is developed in
XHTML, which extends HTML and enforces
strict adherence to XML (Extensible Markup
Language).

• J2ME (Sun Java 2 Micro Edition), which is a 
general platform for programming embedded 
devices. 

• .NET framework, which includes Microsoft’s 
C# language as an alternative to Java. 

• NTT DoCoMo’s i-mode, which currently covers
almost all of Japan with well over 30 million
subscribers. Phones that support i-mode have
access to several services such as e-mail,
banking, news, train schedules, and maps.
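
The point that XHTML enforces strict adherence to XML can be illustrated with Python's standard XML parser. The sketch below is a generic example, not a validated WAP page: the sample markup and the name is_well_formed are assumptions, and the check covers XML well-formedness only, not conformance to any particular mobile profile.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small page in XHTML syntax: lower-case tags, every element closed.
WELL_FORMED = f"""<html xmlns="{XHTML_NS}">
  <head><title>Train schedule</title></head>
  <body>
    <p>Next departure: 10:42<br/>Platform 3</p>
  </body>
</html>"""

# The same content written the way lenient HTML browsers tolerate:
# <br> and <p> are left unclosed, which XML forbids.
TAG_SOUP = WELL_FORMED.replace("<br/>", "<br>").replace("</p>", "")


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(WELL_FORMED), is_well_formed(TAG_SOUP))
```

The first page parses and the second is rejected. A strict parser of this kind is simpler to implement than an error-tolerant HTML parser, which is relevant on devices with limited memory and processing power.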



Standard software tools also should be available
on mobile devices to support, among other
applications:

• E-mail.

• Web browsing and other Web services. 

• Document and data handling, including com- 
pression software. 

• Synchronization of data with other devices. 

• Security and authentication. 

• Personalization and collaboration agents. 

• eLearning content management and delivery, 
which is normally delivered on mobile devices 
via its Web services capability. 

Apart from the last two, these tools are widely 
available, although the different platforms are not 
always compatible. This is not a major problem, 
since communication occurs through standard Web 
and e-mail protocols. Current personalization and 
collaboration tools are based mainly on static profil- 
ing, while what is needed is a more dynamic and 
adaptive approach. There are still outstanding issues 
regarding content management and delivery of 
eLearning materials, since these technologies, which 
we assume will be XML-centric, are still evolving. 



HCI AND MOBILE AND 
WEARABLE DEVICES 

This article will highlight some of the central HCI 
issues regarding the design, development, and use of 
mobile and wearable devices. Our review pertains to 
devices such as mobile phones, personal digital 
assistants (PDAs), and wearable devices, and less 
to mobile devices such as laptops and tablet PCs that 
generally are larger in size. 

Several main HCI issues in the use of mobile and
wearable devices have been posited in the literature,
including contextual concerns (Lumsden & Brewster,
2003; Sun, 2003), limitations of the interface
(Brewster, 2002), and their convergence with other
technologies and systems. These devices reflect the
range of different contexts in which mobile and
wearable technology can be used for interfacing with
data sets, interactive content, and enhanced visual
displays that augment activities and exploration
within physical environments.
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Table 1. Summary of a selection of mobile and wearable interaction tools and interfaces 



Interaction Tool/Interface Type | Example | Description | Reference
--- | --- | --- | ---
Gestural interfaces | Georgia Tech Gesture toolkit | The Georgia Tech toolkit supports those developing gesture-based recognition components of larger systems. The toolkit is based upon Cambridge University’s voice recognition toolkit and uses hidden Markov models. | Westeyn et al. (2003)
Voice input devices | Wearable Microphone Array (WMA) | The Wearable Microphone Array provides an interface between context aware speech and the wearable computer. The system is specially adapted for mobile use and is worn on a tie or shirt. | Xu et al. (2004)
Wearable orientation interfaces | Wearable orientation system | The wearable orientation system tested three different interfaces: a virtual sonic beacon, speech output, and a shoulder-tapping system. The latter two interfaces were found to be helpful for those with sight impairments. | Ross and Blasch (2002)
Wearable orientation interfaces | CyberJacket and Tourist Guide | The CyberJacket incorporates a tourist guide for allowing visitors to the area to orientate more rapidly. The system incorporates an accelerometer device, a GPS location sensor, a sound card, and a processor with Web browser. | Randell and Muller (2002)
Mobile augmented reality | Outdoor Virtual Reality | Outdoor Virtual Reality combines an HMD, Tinmith-evo5 software architecture, and a tracking device to allow virtual and real objects to be interacted with on the move and outside. The authors have developed two applications from their system: a 3D visualization tool and an outdoor game (ARQuake). | Thomas et al. (2002a)
Audio interfaces | Ensemble | Ensemble uses garments fitted with light sensors, accelerometers, and pressure sensors as an interface for children learning about music. MIDI controllers and electronic musical instruments also are integrated. The system allows the children to explore the relation between actions and sounds. | Andersen (2004)
Smart clothing | WearARM | The WearARM provides computation power with a design that blends into existent clothing, strapping around the arm underneath your clothing. Intended mainly as a research platform, it will be integrated into the MIThril (see the following). | Anliker et al. (2002)



According to some commentators, “the design of
interaction techniques for use with mobile and
wearable systems has to address complex contextual
concerns” (Lumsden & Brewster, 2003, p. 197). Because
the physical environment that the mobile user inhabits
is constantly changing, there is a host of
environmental issues to contend with, including
privacy, noise levels, and general interruptions to
the flow of communications and data access. While many
current wearable systems are built on mobile
technology components such as PDAs, these do not
always provide the best interfaces for maximizing
wearability, relying as they do upon graphical and
visual interfaces.

Developers have met this challenge by designing
a whole range of new and adapted interfaces in
order to provide eyes- and hands-free interaction.
A review of some of the recent mobile and wearable
technology interfaces has found the following
interaction tools and interfaces (see Table 1).

Table 1. Summary of a selection of mobile and wearable interaction tools and interfaces, cont.

Interaction Tool/Interface Type | Example | Description | Reference
--- | --- | --- | ---
Smart clothing | Smart clothing prototype for the Arctic environment | The smart clothing prototype for the Arctic includes a suit with communication, global positioning and navigation, user and environment monitoring, and heating. | Rantanen et al. (2002)
Touch pad interface | Touchpad mouse, wearable computers | The touchpad mouse can be used as a component with other wearable computers (i.e., with wearable computer and HMD). The touchpad can be worn in a number of different positions on the body; however, testing has shown that the preferred place is on the thigh. | Thomas et al. (2002b)
Peephole displays that combine pen input with spatially aware displays | PDAs | Peephole displays that combine pen input with spatially aware displays, enabling navigation through objects that are larger than the screen. | Yee (2003)
Body area computing system | Wearable Unit with Reconfigurable Modules (WURM) | Plessl et al. argue that future wearable computing systems should be regarded as embedded systems and suggest the development of a body area computing system composed of distributed nodes around a central communications network. Sensors are distributed around the body using field-programmable gate arrays (FPGAs). | Plessl et al. (2003)

As these divergent interfaces indicate, there is as
yet no preferred interface for wearable technology,
and HCI input into design issues clearly is needed to
inform future integrated systems. In addition to
providing more mobile and embedded interfaces, other
designs have attempted to address individual user
difficulties inherent in traversing the physical
environment while communicating, and some have been
aimed specifically at particular user groups,
including those with hearing or sight impairments
(Ross & Blasch, 2002).

EXAMPLES OF WEARABLE AND 
MOBILE DEVICES 

Wearable devices differ from other mobile devices in
allowing hands-free interaction, or at least in
minimizing the use of a keyboard or pen input when
using the device. This is achieved by devices that are
worn on the body, such as a headset that allows voice
interaction and a head-mounted display that replaces a
computer screen. The area of wearable devices is
currently a hot research topic with potential
applications in many fields (e.g., aiding people with
disabilities). In addition to the interfaces that we
already have mentioned, we have reviewed three
examples of mobile and wearable devices.

The IBM Linux Watch 

(www.research.ibm.com/WearableComputing/factsheet.html)

IBM recently developed a wristwatch computer that it
is commercializing in collaboration with Citizen under
the name WatchPad. Apart from telling the time,
WatchPad supports calendar scheduling, address book
functionality, to-do lists, the ability to send and
receive short e-mail messages, Bluetooth wireless
connectivity, and wireless access to Web services.
WatchPad runs a version of the Linux operating system,
which provides a very flexible platform for developing
software applications. It is possible to design
WatchPad for specific users (e.g., a student’s watch
could hold various schedules and provide location
sensing and messaging capabilities). A recent
commercial product with overlapping functionality,
called Wrist Net Watch (www.fossil.com/tech), has been
developed by Fossil. Current information such as news
headlines and weather is delivered in real time to the
watch through the MSN Direct service.

Xybernaut Mobile Assistant 

(www.xybernaut.com/Solutions/product/mav_product.htm)

This commercial product is the most widely avail- 
able multi-purpose wearable device currently on the 
market. It is a lightweight wearable computer with 
desktop/laptop capabilities, including wireless Web 
connectivity and e-mail, location sensing, hands-free 
voice recognition and activation, access to data in 
various forms, and other PC-compatible software. It 
has a processor module that can be worn in different 
ways, a head-mounted display unit, a flat-panel 
display that is touch-screen activated and allows pen 
input, and a wrist-strapped mini-keyboard. Xybernaut 
is currently trialling the use of the mobile assistant in 
an educational context, concentrating on students 
with special needs. It allows the student full comput- 
ing access beyond the classroom, including the 
ability to do standard computing functions such as 
calculations, word processing, and multi-media dis- 
play and, in addition, has continuous Internet con- 
nectivity and voice synthesis capabilities. It also 
supports leisure activities, such as listening to music 
and playing games. 

iButtons 

(www.ibutton.com/ibuttons/index.html) 

iButtons, developed by Dallas Semiconductor
Corporation/Maxim, currently are being piloted in a
range of educational institutions. An iButton is a
computer chip enclosed in a durable stainless steel
can. Each can has a data contact (called the lid) and
a ground contact (called the base) that are connected
to the chip inside the can. By touching each of the
two contacts, it is possible to communicate with an
iButton, and iButtons are distinguished from each
other by each having a unique identification address.
By adding different functionality to the basic iButton
(i.e., memory, a real-time clock, security, and
temperature sensing), several different products are
being offered. There are many applications for this
technology, including authentication and access
control, eCash, and a range of other services. In
educational contexts, these smart buttons allow
registration of students as well as access to
classrooms, Web pages, and computers.
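
The unique identification address mentioned above can be checked for transmission errors before it is used, for example, to register a student. The article gives no details of the address format; the Python sketch below relies on the manufacturer's published 1-Wire convention, in which the address is 64 bits long: an 8-bit family code, a 48-bit serial number, and an 8-bit CRC computed with the polynomial x^8 + x^5 + x^4 + 1. The function names and the sample serial number are made up for illustration.

```python
def crc8_maxim(data: bytes) -> int:
    """CRC-8 with polynomial x^8 + x^5 + x^4 + 1, least-significant bit first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x8C is the bit-reversed form of the polynomial 0x31.
            crc = (crc >> 1) ^ 0x8C if crc & 1 else crc >> 1
    return crc


def make_rom_id(family_code: int, serial: bytes) -> bytes:
    """Build an 8-byte address: family code, 6-byte serial, then the CRC."""
    if len(serial) != 6:
        raise ValueError("serial number must be exactly 6 bytes")
    body = bytes([family_code]) + serial
    return body + bytes([crc8_maxim(body)])


def is_valid_rom_id(rom_id: bytes) -> bool:
    """An address is accepted when the CRC over all 8 bytes comes out as zero."""
    return len(rom_id) == 8 and crc8_maxim(rom_id) == 0


# Arbitrary sample values, not a real device address.
rom_id = make_rom_id(0x01, bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC]))
corrupted = bytes([rom_id[0] ^ 0x01]) + rom_id[1:]
print(is_valid_rom_id(rom_id), is_valid_rom_id(corrupted))
```

A reader application would run such a check each time a button is touched and discard any address that fails it, since brief or noisy contact can corrupt the bits read.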

MIThril: A Platform for Context-Aware 
Wearable Computing 

(www.media.mit.edu/wearables/mithril/)

MIThril is a wearable research platform developed 
at the MIT Media Lab (DeVaul et al., 2001). Al- 
though not a commercial product, MIThril is indica- 
tive of the functionality that we can expect in next- 
generation wearable devices. Apart from the hard- 
ware requirements, it includes a wide range of 
sensors with sufficient computing and communica- 
tion resources and the support for different kinds of 
interfaces for user interaction, including a vest. 
There are also ergonomic requirements that include 
wearability (i.e. the device should blend with the 
user’s ordinary clothing) and flexibility (i.e., the 
device should be suitable for a wide range of user 
behaviors and situations). 

As an application of this architecture, a reminder 
delivery system called Memory Glasses was devel- 
oped, which acts on user-specified reminders (e.g. 
“During my next lecture, remind me to give addi- 
tional examples of the applications of wearable 
computers”) and requires a minimum of the wearer’s 
attention. Memory Glasses uses a proactive re- 
minder system model that takes into account time, 
location, and the user’s current activities based on 
daily events that can be detected (e.g., entering or
leaving an office). 
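
A proactive reminder model of this kind can be sketched as a filter over the wearer's sensed context. The sketch below is not the Memory Glasses implementation; the `Context` and `Reminder` types, their field names, and the exact-match rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Context:
    """Snapshot of what the wearable currently senses (hypothetical fields)."""
    now: datetime
    location: str
    activity: str


@dataclass
class Reminder:
    text: str
    location: Optional[str] = None        # None means "any location"
    activity: Optional[str] = None        # None means "any activity"
    not_before: Optional[datetime] = None  # None means "any time"

    def is_due(self, ctx: Context) -> bool:
        # A reminder fires only when every condition it specifies is met.
        if self.not_before is not None and ctx.now < self.not_before:
            return False
        if self.location is not None and ctx.location != self.location:
            return False
        if self.activity is not None and ctx.activity != self.activity:
            return False
        return True


def due_reminders(reminders, ctx):
    """Return the texts of all reminders whose conditions match the context."""
    return [r.text for r in reminders if r.is_due(ctx)]
```

Delivering a reminder only when time, location, and activity all match is one simple way to keep demands on the wearer's attention low.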

FUTURE TRENDS 

Wearable and mobile devices currently are being 
used in a range of contexts, but they also are being 
used in conjunction with a range of other technolo- 
gies that may have implications for the evolution of 
human-computer interfaces. Possible uses might 
include the use of wearable and mobile devices for 
outdoor activities; for example, Cheok, et al. (2004) 
consider the use of wearable devices in conjunction 
with game play that links virtual and real spaces (Xu 
et al., 2003). Wearables also might allow users to 
explore access to a range of personalized informa- 
tion services integrating access through portal sys-

for products at a relatively early stage of their
development. It is not hard to predict that the tech- 
nological issues we have touched upon will continue 
to be addressed and improved. Regarding standards, 
we expect current ones to evolve in parallel with 
new developments, but due to the experimental 
nature of some of these devices, there will be periods 
where non-standard appliances will be piloted. 

Personalization of user interaction is also an 
important issue, where adaptation to the user behav- 
ior is critical, easing the customization of the inter- 
face to suit users’ specific needs within the context 
of the device being used (Weld et al., 2003). Advances
in machine learning and artificial intelligence
on the one hand and information overload on the
other have led to a new challenge: building enduring
personalized cognitive assistants that adapt to
their users by sensing the user’s interaction with the
environment, respond intelligently to a range
of scenarios that may not have been encountered
previously, and anticipate the next
action to be taken (Brachman, 2002).

Finally, it is also important to investigate the 
social potential and impact of wearable and mobile 
devices (Kortuem & Segall, 2003) so that collabora- 
tive systems can be developed to facilitate and 
encourage interaction among members of the com- 
munity. One possible educational application of such 
a collaborative system may be an interactive learn- 
ing environment that supports a range of mobile and 
wearable devices in addition to integrating a range of 
learning services. 

REFERENCES



Andersen, K. (2004). ensemble: Playing with sen- 
sors and sound. Proceedings of the Conference 
on Human Factors and Computing Systems, 
Vienna, Austria. 

Anliker, U., Lukowicz, P., Troester, G., Schwartz, S. 
& DeVaul, R.W. (2002). The WearARM: Modular, 
high performance, low power clothing platform designed
for integration into everyday clothing. Proceedings
of the 5th International Symposium on
Wearable Computers.

Brachman, R. (2002). Systems that know what they 
are doing. IEEE Intelligent Systems, 17(6), 67-71. 

Brewster, S. (2002). Overcoming the lack of screen
space on mobile computers. Personal and Ubiquitous
Computing, 6(3), 188-205.

Cheok, A.D., et al. (2004). Human personal & 
ubiquitous computing, 8(2), 71-81. 

DeVaul, R.W., Schwartz, S., & Pentland, A. (2001).
MIThril: Context-aware computing for daily life.
Retrieved August 1, 2004, from
http://www.media.mit.edu/wearables/mithril/MIThrill.pdf

Di Pietro, R., & Mancini, L.V. (2003). Security and
privacy issues of handheld and wearable wireless
devices. Communications of the ACM, 46(9), 75-79.

Kortuem, G., & Segall, Z. (2003). Wearable com- 
munities: Augmenting social networks with wear- 
able computers. IEEE Pervasive Computing, 2(1), 
71-78. 

Lumsden, J., & Brewster, S. (2003). A paradigm 
shift: Alternative interaction techniques for use with 
mobile and wearable devices. Proceedings of the 
2003 Conference of the Centre for Advanced 
Studies on Collaborative Research, Toronto, 
Canada. 

Page, D. (2003). Computer Ready to Wear. High 
Technology Careers. Retrieved May 6, 2003, from 
www.hightechcareers.com/doc799/readyto 
wear799.html 

Perry, M., O’Hara, K., Sellen, A., Brown, B., &
Harper, R. (2001). Dealing with mobility: Understanding
access anytime, anywhere. ACM Transactions
on Computer-Human Interaction, 8(4),
323-347.

Piekarski, W., & Thomas, B.H. (2004). Interactive 
augmented reality techniques for construction at a 
distance of 3D geometry. Proceedings of the Work- 
shop on Virtual Environments 2004, Zurich, Swit- 
zerland. 

Plessl, C., et al. (2003). The case for reconfigurable
hardware in wearable computing. Personal and 
Ubiquitous Computing, 7, 299-308. 

Randell, C., & Muller, H.L. (2002). The well-man- 
nered wearable computer. Personal and Ubiqui- 
tous Computing, 6, 31-36. 

Rantanen, J., et al. (2002). Smart clothing prototype 
for the Arctic environment. IEEE Pervasive and 
Ubiquitous Computing, 6, 3-16. 

Ross, D.A., & Blasch, B.B. (2002). Development of 
a wearable computer orientation system. Personal 
and Ubiquitous Computing, 6, 49-63. 

Sun, J. (2003). Information requirement elicitation in 
mobile commerce. Communications of the ACM, 
46(12), 45-47. 

Thomas, B., et al. (2002a). First person indoor/ 
outdoor augmented reality application: ARQuake. 
Personal and Ubiquitous Computing, 6, 75-86. 

Thomas, B., Grimmer, K., Zucco, J., & Milanese, S. 
(2002b). Where does the mouse go? An investiga- 
tion into the placement of a body-attached touch-pad 
mouse for wearable computers. Personal and Ubiq- 
uitous Computing, 6, 97-112. 

Weld, D., et al. (2003). Automatically personalizing 
user interfaces. Retrieved May 12, 2003, from 
www.cs.washington.edu/homes/weld/papers/weld- 
ijcai03.pdf 

Westeyn, T., Brashear, H., Atrash, A., & Starner, T. 
(2003). Multimodal architectures and frameworks: 
Georgia Tech gesture toolkit: Supporting experiments
in gesture recognition. Proceedings of the
5th International Conference on Multimodal Interfaces,
Vancouver, Canada (pp. 85-92).



Xu, K., Prince, J.D., Cheok, A.D., Qiu, Y., & 
Kumar, K. G. (2003). Visual registration for unpre- 
pared augmented reality environments. Personal 
and Ubiquitous Computing, 7(5), 287-298. 

Xu, Y., Yang, M., Yan, Y., & Chen, J. (2004).
Wearable array as user interface. Proceedings of
the 5th Australasian User Interface Conference
(AUIC2004).

Yee, K-P. (2003). Peephole displays: Pen interac- 
tion on spatially aware handheld computers. Pro- 
ceedings of the ACM Conference on Human 
Factors in Computing Systems 2003, Ft. Lauder- 
dale, Florida. 

KEY TERMS 

Hands-Free Operation: Allows the user to 
interact with data and information without the use of 
hands. 

Head-Mounted Displays (HMDs): Visual dis- 
play units that are worn on the head as in the use of 
VR systems. 

Head-Up Displays (HUDs): Displays of data
and information that are superimposed upon the 
user’s field of view. 

Mobile Devices: Can include a range of por- 
table devices, including mobile phones and PDAs, 
but also can include wearable devices, such as 
HMDs and smart clothing, that incorporate sensors 
and location tracking devices. 

Multimodal Interaction: Uses more than one 
mode of interaction and often uses visual, auditory, 
and tactile perceptual channels of interaction. 

Pervasive and Context-Aware Computing:
Allows mobile devices to affect everyday life in a
pervasive and context-specific way.

Wearable Devices: May include microproces- 
sors worn as a wristwatch or as part of clothing. 

Wearable Sensors: Can be worn and detected 
by local computing systems.

INTRODUCTION

Credibility evaluation processes on the World Wide 
Web are subject to a number of unique selective 
pressures. The Web’s potential for supplying timely,
accurate, and comprehensive information contrasts 
with its lack of centralized quality control mecha- 
nisms, resulting in its simultaneous potential for 
doing more harm than good to information seekers. 
Web users must balance the problems and potentials 
of accepting Web content and do so in an environ- 
ment for which traditional, familiar ways of evaluat- 
ing credibility do not always apply. Web credibility 
research aims to better understand this delicate 
balance and the resulting evaluation processes em- 
ployed by Web users. 

This article reviews credibility conceptualizations 
utilized in the field, unique characteristics of the Web 
relevant to credibility, theoretical perspectives on 
Web credibility evaluation processes, factors influ- 
encing Web credibility assessments, and future 
trends. 



BACKGROUND 

Credibility is one of several dimensions that influ- 
ence message persuasiveness (Petty & Cacioppo, 
1986), attitudes toward an information source (Sundar, 
1999), and behaviors relevant to message content 
(Petty & Cacioppo, 1981). While credibility is largely 
viewed as a source characteristic, attitudinal assess- 
ments relevant to credibility, including those made on 
the Web, are directed at messages (content), sources 
(information providers), and media (the Web itself). 

Conceptualizations of source credibility have tra- 
ditionally focused on two primary source attributes, 
expertise and trustworthiness (Hovland & Weiss, 
1951), and these conceptualizations have been influ- 
ential in Web credibility research (Fogg & Tseng, 
1999; Wathen & Burkell, 2002). Expertise refers to 
a source’s perceived ability to provide information
that is accurate and valid (based on attributes such
as perceived knowledge and skill), while trustworthiness
refers to a source’s perceived willingness to
provide accurate information given the ability (based
on attributes such as perceived honesty and lack of
bias; Hovland, Janis, & Kelley, 1953). Thus, the
underlying dimensions in conceptualizations of cred- 
ibility predominantly refer to perceived qualities. 
Particularly with respect to interactive systems, 
including the Web, existing research has focused 
primarily on factors influencing the perception of 
credibility as opposed to factors predicting objective 
measures of accuracy. 

Numerous related constructs have been investi- 
gated in the Web credibility literature, including 
believability (Flanagin & Metzger, 2000), which is 
arguably a synonymous construct of credibility (Tseng 
& Fogg, 1999); information completeness (Dutta- 
Bergman, 2004), referring to the extent to which 
necessary elements for confirming message accu- 
racy are present; cognitive authority (Rieh, 2002), 
referring to the extent to which users believe they 
can trust the information; and reputation (Toms & 
Taves, 2004), referring to future expectations of 
information quality and credibility. 

Attitudes toward messages that are relevant to 
credibility and its related constructs are determined 
at least by the characteristics of (and interactions 
amongst) the source, message, and receiver (Self, 
1996; Slater & Rouner, 1996). Such assessments 
are often extensions of source credibility: Credible 
sources are viewed as likely to produce credible 
messages. Particularly when constraints such as 
limited time, lack of ability, or low motivation force 
the user to focus on surface or peripheral features of 
the message, source, or medium in processing Web 
content, one may expect source credibility to heavily 
influence perceptions of message accuracy and 
information quality (see Petty & Cacioppo, 1986). 

In recognizing the frequent need for computer 
users to balance a range of information-seeking 
goals with the need for efficiency and productivity,
Fogg and Tseng (1999) have proposed four types of
credibility in assessing interactive systems: pre- 
sumed, reputed, surface, and experienced. Pre- 
sumed credibility assessments are based upon gen- 
eral underlying assumptions about the system, for 
example, in assuming that Web sites in the dot-org 
domain are more credible than those in the dot-com 
domain. Reputed credibility assessments are based 
upon third-party reports or endorsements, for ex- 
ample, in finding pages linked to by a credible site as 
likely to provide accurate information. Surface cred- 
ibility assessments are based upon features observ- 
able via simple inspection, for example, in using 
visual design or interface usability as an indicator of 
credibility. Finally, experienced credibility is based 
upon first-hand experience with the system, for 
example, in returning to a Web site that has previ- 
ously provided information verified by the user to be 
accurate. 

Conceptualizations and taxonomies of credibility 
recognize the construct as not only referring to 
source characteristics, but as referring to attributes 
relevant to the perceived likelihood of message 
accuracy and validity. In so doing, they distinguish 
credibility from another related construct: trust. 
Trust relates more properly to the perceived likeli- 
hood of behavioral intentions, reliability, and depend- 
ability rather than message accuracy, and as Fogg 
and Tseng (1999) point out, the word is often used in 
phrases referring to credibility, such as “trust the 
information” and “trust the advice.” 

Given a grounding in the credibility concept, 
Web-credibility researchers have set out to 
operationalize the construct in a number of ways. As 
Wathen and Burkell (2002) point out, credibility may 
be operationalized by either direct or indirect assess- 
ment methods, both of which have been applied to 
Web credibility research. Researchers employ di- 
rect assessment methods by asking users to rate the 
extent to which the source, message, or medium is 
described by the underlying dimensions of credibil- 
ity. Indirect methods in the field include measuring 
attitude and behavior changes as a result of stimulus 
Web content. Moreover, the field is by no means 
limited in its range of methodological approaches. 
Experimental, quasi-experimental, and traditional 
and Web survey methods are all commonly employed.
Qualitative analyses, including interviews,
case studies, and thinking-aloud protocols, are also
employed to investigate user reasoning about cred- 
ibility. 

UNIQUENESS OF THE WEB 

The types of needs that trigger usage of the Web 
may be relatively similar to other media (Rieh & 
Belkin, 1998), and Sundar (1999) has found the 
underlying dimensions of Web and traditional media
credibility assessments to be similar. Rieh (2002), on 
the other hand, has since found that the range of 
evidence Web users consider in making these as- 
sessments is much wider than for other media, and 
even in cases where the factors considered are 
similar, they may be weighed differentially across 
media (Payne, Dozier, & Nomai, 2001). Moreover, 
the Web may be less credible than print newspapers 
(Flanagin & Metzger, 2000), but in some cases, more 
credible than traditional media counterparts such as 
television, radio, and magazines (Flanagin & Metzger; 
Johnson & Kaye, 1998). Finally, Klein (2001) has 
found users to be generally aware of credibility 
differences between the Web and other media. 

Given these differences, one may ask, What is 
special about the Web with respect to credibility? 
Researchers have theorized or empirically identi- 
fied a number of ways in which features of the Web 
may give rise to differences between online cred- 
ibility assessments and those made with traditional 
media. These explanations tend to focus on four 
general characteristics of the Web: (a) the relative 
lack of filtering and gatekeeping mechanisms, (b) 
the form of the medium, including interaction tech- 
niques and interface attributes either inherent to 
the Web and other hypertext systems or emergent 
from common design practices, (c) a preponder- 
ance of source ambiguity and relative lack of 
source attributions, and (d) the newness of the Web 
as a medium in conjunction with its lack of evalua- 
tion standards. 

Filtering Mechanisms 

Perhaps the most critical feature of the Web with 
respect to user credibility evaluations is its relative 
lack of centralized information filtering or quality
control mechanisms (Abdulla, Garrison, Salwen,
Driscoll, & Casey, 2002; Andie, 1997; Flanagin & 
Metzger, 2000; Johnson & Kaye, 1998). In contrast 
to traditional media, Web users are free to upload 
information irrespective of scrutiny (Johnson & 
Kaye), and content is frequently made available 
without the benefit of editorials, reviews, and other 
gatekeeping procedures (Flanagin & Metzger). This 
lack of quality control can affect perceptions of 
credibility for the Web as a medium (Johnson & 
Kaye) and users’ evaluative processes, shifting their 
attribute focus (Rieh, 2002). 

The Web may have a property analogous to 
gatekeeping procedures, primarily in the form of 
ranking systems evaluating link structures as these 
structures provide predictive power over credibility 
assessments (Toms & Taves, 2004). Simultaneously, 
however, the Web fosters the incidental arrival at 
sites, and this may increase the likelihood of encoun- 
tering inaccurate information (Andie, 1997). 

Form

The Web, much like the television, offers new form
factors and interactive characteristics previously
unavailable in information-seeking environments. Just
as the television’s multimodal properties altered evaluative
processes and credibility perceptions (Newhagen
& Nass, 1989), the Web’s unique interactive features,
in conjunction with emerging design practices
that further distinguish it from traditional media, may
result in fundamentally different credibility-evaluation
processes.

The relative ease of data manipulation, duplication,
and dissemination is one critical characteristic of
digital information systems. Web content is susceptible
to frequent alteration (Metzger, Flanagin, &
Zwarun, 2003), can easily be tailored to individual
recipients (Campbell et al., 1999), and is easily duplicated
and widely replicated. This last attribute potentially
has significant implications for within-medium
verification procedures since inaccurate content may
be replicated by its recipients with extraordinary
ease. Unique evaluation processes may also result
from the diverse and relatively unstructured organization
of content (Rieh, 2002), the lack of organizational
conventions (Burbules, 2001), and a relative
difficulty in distinguishing content from advertising
(Flanagin & Metzger, 2000) due in part to both
browser display and design conventions.

Source Ambiguity

Web content often lacks a clear source. In many
cases, the source is not present at all (Burbules,
2001; Eastin, 2001), and in others it is present but
not easy to ascertain (Toms & Taves, 2004). This
problem is accompanied by users’ generally high
reliance on source identity as a criterion for assessing
information quality and credibility (Rieh & Belkin,
1998). Such ambiguity also coincides with Rieh’s
finding that a group of scholars showed a greater
reliance on source identity at the institutional level
(such as URL [uniform resource locator] domain
type) than at the individual level (such as author
credentials), which contrasts with findings regarding
traditional media. Additionally, Web users often
lack information about source reputation (Toms &
Taves), a potentially unavoidable problem due to
the number and diversity of online sources.



Infancy as a Medium 

Finally, one indicator of the Web’s uniqueness with
respect to credibility is the general concern over 
fostering Internet literacy (Greer, 2003) in addition 
to the acknowledged need for evaluation guidelines 
and assessment standards (Tate & Alexander, 1996; 
Wathen & Burkell, 2002). As Greer points out, 
evaluating credibility on the Web is not an easy task, 
and this difficulty is due in part to the need to learn 
new evaluative skills, such as checking URL do- 
mains. Not only must users master new evaluative 
techniques, but they must do so in an environment 
in which familiar ways of assessing credibility are 
less applicable (Burbules, 2001) and in contexts that 
may require them to rethink previous evaluative 
strategies. As Graefe (2003) points out, information 
objects on the Web (such as product descriptions) 
are removed from sensory information typically 
used to verify claims in real-world evaluative con- 
texts. While this is of course equally true of some 
non-Web contexts, e-commerce ups the ante: De- 
cisions about credibility and relevant behaviors must 
frequently be made wholly independent of normally 
available sensory information.

EVALUATIVE PROCESSES

While relatively few theories of Web credibility 
evaluation have been proposed thus far, existing 
frameworks and perspectives provide useful ways 
of conceptualizing online credibility assessment pro- 
cesses. Fogg (2002, 2003a, 2003b) views credibility 
assessment as an iterative process resulting in the 
coordination of several component assessments of 
noticeable elements. Prominence interpretation 
theory posits two aspects of credibility assess- 
ments: (a) the likelihood of an element related to the 
source or message under evaluation being noticed 
(prominence), and (b) the value assigned to the 
noticed element based on the user’s judgment about 
how the element affects the likelihood of information 
accuracy (interpretation). Fogg identifies five fac- 
tors affecting prominence: user involvement, infor- 
mation topic, the task, experience level, and other 
individual differences such as the need for cognition. 
Three factors affecting interpretation are identified: 
user assumptions, skills and knowledge, and contex- 
tual factors such as the environment in which the 
assessment is made. The process of noticing promi- 
nent interface and message elements and assigning 
evaluative judgments to each occurs iteratively until 
the user reaches satisfaction with an overall cred- 
ibility assessment or reaches a constraint, such as 
lack of time. Fogg points out that seemingly discrep- 
ant findings in the Web-credibility literature on the 
effects of a particular factor (for example, whether 
privacy policies impact credibility assessments) may 
be explained parsimoniously if one of the studies is 
found to have focused on element prominence and 
the other on interpretation (Fogg, 2003a). 
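
The iterative process described above can be sketched numerically. The sketch rests on assumptions that go beyond the text: prominence is represented as a number between 0 and 1, interpretation as a signed value, and their product as each element's contribution. The function name, the stopping threshold, and the element budget are hypothetical.

```python
def assess(elements, satisfied_at, max_elements):
    """Accumulate credibility impact element by element.

    elements: list of (name, prominence, interpretation) tuples, where
        prominence is in [0, 1] and interpretation is a signed judgment.
    satisfied_at: stop once the absolute running total reaches this value.
    max_elements: a stand-in for a constraint such as lack of time.
    Returns (running total, number of elements examined).
    """
    total = 0.0
    examined = 0
    # The most prominent elements are assumed to be noticed first.
    for _name, prominence, interpretation in sorted(
        elements, key=lambda e: e[1], reverse=True
    ):
        if examined >= max_elements or abs(total) >= satisfied_at:
            break
        total += prominence * interpretation
        examined += 1
    return total, examined
```

Under these assumptions an element with zero prominence contributes nothing however favorably it would be interpreted, which is one way to read the privacy-policy example: a study measuring whether the policy is noticed and a study measuring how it is judged once noticed can both be correct.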

Wathen and Burkell (2002) have conceptualized 
evaluative processes of Web credibility in terms of 
a stage model (with the caveat that their proposed 
stages may represent simultaneous evaluations). 
They distinguish between evaluations of surface 
credibility, message credibility, and content. In sur- 
face credibility assessments, users focus on presen- 
tational and organizational characteristics of a Web 
site, deciding whether the site is likely to provide the 
desired content. In message credibility assessments, 
users more thoroughly review indicators of source
and message credibility, deciding whether the provided
information is likely to be believable. Finally, in
content assessments, users integrate source evalu- 
ations with self-knowledge about their own exper- 
tise, domain knowledge, and information needs, de- 
ciding if and how to act on the information. If failure 
occurs at either the surface or message credibility 
assessment stages, the user is likely to leave the site. 
Influenced by the elaboration likelihood model (Petty 
& Cacioppo, 1986), Wathen and Burkell further 
suggest that the probability of leaving interacts with 
individual differences, such as the need for cogni- 
tion, need for the information, and motivation; if the 
user is highly motivated, surface features may be 
overlooked. 
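
The stage model can be sketched as a short decision function. This is a simplification under stated assumptions: the stages run strictly in sequence (the authors note they may be simultaneous), each assessment is reduced to a boolean, and motivation is a single flag. The function name and return strings are hypothetical.

```python
def evaluate_site(surface_ok, message_ok, highly_motivated=False):
    """Walk through the three stages and report where the user ends up."""
    # Stage 1: surface credibility (presentation and organization).
    # A highly motivated user is assumed to overlook surface problems.
    if not surface_ok and not highly_motivated:
        return "leave (surface)"
    # Stage 2: message credibility (source and message indicators).
    if not message_ok:
        return "leave (message)"
    # Stage 3: content evaluation, where the user decides if and how to act.
    return "evaluate content"
```

For example, `evaluate_site(False, True)` ends at the surface stage, while the same site visited by a highly motivated user proceeds to content evaluation.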

Notably, verification procedures are not explicitly
included in Fogg’s (2002, 2003a, 2003b) or
Wathen and Burkell’s (2002) theories, and this 
absence is supported by empirical research finding 
the verification of Web content to be infrequent 
(Metzger et al., 2003; Nozato, 2002). Their theories 
are complemented by a few theoretical perspectives 
in the Web-credibility literature. Rieh (2002), based 
on judgment and decision-making research, sug- 
gests Web users make at least two types of assess- 
ments: (a) predictive judgments prior to encounter- 
ing an information object based on existing knowl- 
edge and assumptions, and (b) evaluative judg- 
ments based on characteristics of the information 
object. Predictive assessments may also be based on 
characteristics of an information object’s surrogate,
such as hyperlinked text. 

In the context of consumer assessments of product
quality, Graefe (2003), following Nelson (1974),
provides a conceptualization more explicitly ac- 
counting for verification procedures, distinguishing 
between search qualities (discovered during in- 
spection of the information object), experience 
qualities (discovered only after use of the informa- 
tion object), and credence qualities (such as an 
information provider’s intentions) that cannot be 
verified and that introduce inherent risk into the 
assessment process. Graefe further points out that 
on the Web, search qualities often must take the 
place of experience qualities due to verification 
difficulties. 






FACTORS AFFECTING 
WEB CREDIBILITY 

The influences of a number of Web content and 
individual site characteristics on credibility have 
been investigated in the research literature, with a 
few important general trends focusing on the impact 
of interface attractiveness, site-operator identity, 
advertising, individual differences, and the topic of 
the site’s content. 

Visual Design 

Credibility judgments often have a striking depen- 
dence on the surface assessments of visual appear- 
ance and interface characteristics, with profes- 
sional-looking design that is appropriate to site con- 
tent significantly increasing credibility (Eysenbach 
& Kohler, 2002; Fogg, Soohoo, Danielson, Marable, 
Stanford, & Tauber, 2003; Kim & Moon, 1998). The 
likability of Web sites is significantly affected by 
interface attractiveness (Roberts, Rankin, Moore, 
Plunkett, Washburn, & Wilch-Ringen, 2003), and 
likability in turn impacts credibility assessments 
(Cialdini, 2001). As Fogg, Soohoo, et al. (2003) point 
out, this is consistent with social psychological re- 
search indicating that attractiveness increases the 
credibility of human communicators. 

Identity 

A number of site characteristics impacting credibil- 
ity center around demonstrating the identity, contact 
information, and credentials of real individuals asso- 
ciated with the site (Fogg, Marshall, Faraki, et al., 
2001; Rieh, 2002). While personal photos can have 
either a negative or positive impact on trustworthi- 
ness depending on contextual factors (Fogg, Marshall, 
Kameda, et al., 2001; Riegelsberger, Sasse, & 
McCarthy, 2003), indicators of real human beings 
behind the site tend to increase credibility. The 
importance of identity in Web-credibility assess- 
ments is consistent with a strong reliance on source 
authority (Rieh & Belkin, 1998). 



Advertising

The impact of advertising on site credibility appears
to arise from two competing pressures. On the one
hand, users have motivation to ignore advertising in
favor of information-seeking goals (Greer, 2003),
and on the other, they have motivation to examine
advertising content as an indicator of source credibility
(Fogg, Soohoo, et al., 2003). Advertising can
negatively impact credibility assessments, particularly
when it is not clearly distinguished from site
content (Fogg, Marshall, Faraki, et al., 2001). While
relevance between advertising and site content can
positively influence attitudes toward the advertisement
(Cho, 1999; Shamdasani, Stanaland, & Tan,
2001), it is less clear if the effect occurs in the
opposite direction (Choi & Rifon, 2002).

Individual Differences

A number of user characteristics impact credibility
assessments of the Web as a medium and of individual
sites. Web credibility research suggests females
generally find the Web more credible than
males (Robinson & Kaye, 2000), older users find
online news less credible than younger users (Johnson
& Kaye, 1998), and college students find the Web
more credible than the general adult population
(Metzger et al., 2003). Younger users additionally
tend to be more critical of typographical errors and
broken links than older users (Fogg, Marshall, Faraki,
et al., 2001). Both an experiment by Greer (2003)
and a survey study by Nozato (2002) found a significant
positive relationship between usage and perceptions
of online news credibility.

In addition to age, gender, and Web usage, a
critical factor in Web credibility assessments is
content-domain expertise. In a study comparing the
assessments of health and finance experts to those
of general Web users, Stanford, Tauber, Fogg, and
Marable (2002) found health experts to focus on
name reputation, source attributions, and company
motive more than general Web users, and finance
experts to focus more on the quantity of available
information, company motive, and potential biases
than general Web users.

Content Domain



Stanford et al. (2002) point out that there are inher- 
ent differences in the types of information provided 
within varying domains, including how established 
the information commonly tends to be and the typical
goals of information providers within that domain.
These may lead to different expectations when 
users evaluate content. Rieh (2002) found evalua- 
tions of computer-related and medical information to 
rely more heavily on assessments of trustworthiness 
than for research and travel. Moreover, Rieh’s work
indicated less focus on the source in making credibil- 
ity assessments of travel sites, consistent with Fogg, 
Soohoo, et al. (2003), who also found the effect for 
search-engine sites. This last finding may reflect the 
unique nature of sites acting primarily as gatekeepers 
to other brand-name sites; the recognizable airlines 
and high-ranking sites pointed to may cause users to 
overlook the reputation of the gatekeeper itself, 
leading to the effect. 

FUTURE TRENDS 

Web credibility research has identified factors influ- 
encing credibility and evaluative strategies employed 
by users, but the area remains ripe for further 
empirical work. This section suggests three emerg- 
ing and important areas in the field: the relationship 
between network structures on the Web and cred- 
ibility, the effects of user motivation, and further 
theory development. 

Networks of Credibility 

Hypertext structures like the Web offer the opportunity 
to understand the connection between network 
structures (both actual structure and the structure 
perceived by users) and end-user credibility 
assessments. Toms and Taves (2004) provide an 
important step in this direction, showing that link 
structures are powerful indicators of not only rel- 
evance, but of credibility. The extent to which users 
explicitly or implicitly recognize these structures and 
employ them in credibility assessments is unknown. 

User Motivation 

Although Web credibility research has been influenced 
by Petty and Cacioppo’s (1986) elaboration likelihood 
model and their notions of central and peripheral 
routes to persuasion, little work has focused on the 
impact of motivation on Web credibility assessments. 
As Dutta-Bergman (2004) points out, the potential 
analogy between the notions of directed search and 
browsing and the notions of central and peripheral 
processing points the way to useful research. 

Theory Development 

Finally, it is worth noting that because Web credibil- 
ity research is a relatively new field of inquiry, 
theoretical frameworks that can drive systematic 
programs of empirical work are only recently begin- 
ning to appear. These early frameworks are critical, 
and there remains a need for the further develop- 
ment and empirical testing of Web-specific theories 
of credibility assessment. 

CONCLUSION 

Evaluative processes and credibility assessments on 
the Web arise out of complex interactions between 
characteristics of the user, the site under evaluation, 
and the Web as a medium. Fundamental character- 
istics of the Web act as pressures on information 
seekers, shaping which interface elements will be 
noticed, how they will be interpreted, and the evalu- 
ative processes users will employ in making credibil- 
ity assessments. 

KEY TERMS 

Credibility: A characteristic of information 
sources that influences message persuasiveness, 
attitudes toward the information source, and behav- 
iors relevant to message content, consisting of two 
primary attributes: expertise and trustworthiness. 

Evaluative Judgment: An assessment based 
on characteristics of an information object indepen- 
dent of assessments based on information prior to 
encountering the object (predictive judgments). 

Experienced Credibility: A credibility assess- 
ment based upon first-hand experience with a system. 



Expertise: A source’s perceived ability to pro- 
vide information that is accurate and valid (based on 
attributes such as perceived knowledge and skill); 
with trustworthiness, it is one of two primary at- 
tributes of credibility. 

Predictive Judgment: An assessment (prior to 
encountering an information object) based on exist- 
ing knowledge, assumptions, or the information 
object’s surrogate independent of assessments based 
on characteristics of the object (evaluative judg- 
ments). 



Presumed Credibility: A credibility assess- 
ment based upon general underlying assumptions 
about a system. 

Reputed Credibility: A credibility assessment 
based upon third-party reports or endorsements. 

Surface Credibility: A credibility assessment 
based upon features observable via simple inspec- 
tion. 



Trustworthiness: A source’s perceived will- 
ingness to provide accurate information given the 
ability (based on attributes such as perceived hon- 
esty and lack of bias); with expertise, it is one of two 
primary attributes of credibility. 






Web-Based Human Machine Interaction in 
Manufacturing 



Thorsten Blecker 

Hamburg University of Technology, Germany 

Gunter Graf 

University of Klagenfurt, Austria 



INTRODUCTION 

The quality of HMI in automation is an important 
issue in manufacturing. This special form of interac- 
tion occurs when the combination of human abilities 
and machine features is necessary in order to 
perform the tasks in manufacturing. Balint (1995) 
has identified three categories of such human-ma- 
chine systems: 

1. Machines might do the job without human 
involvement, but the feasibility is questionable. 
For example, weld seams in car assembly are 
made mostly autonomously by robots, but in 
many cases, humans have to guide the robot to 
the weld point, because the robot is not able to 
locate the point correctly, which is a relatively 
easy task for a human. 

2. Humans might do the job without machines, but 
the efficiency/reliability is questionable. This is 
the case in almost all cases of automation (e.g., 
the varnishing of cars). 

3. HMI is necessary (no purely machine- or human-based 
execution is possible). Although robots today are 
widely in use, in many cases they cannot substitute 
for humans completely, because the possible conflicts 
that can occur are so diverse that a robot alone 
cannot manage them. 

The term HMI is used widely for the interaction 
of a human and a somewhat artificial, automated 
facility, which is true in many situations, including 
HCI. In this article, we speak of HMI in industrial 
settings. We use the term machine specifically for 
industrial facilities that produce a certain (physical) 
output; in this case, the term man-machine interaction 
also is used synonymously for HMI. We define 
HMI as the relation between a human operator and 
one or more machines via an interface for embracing 
the functions of machine handling, programming, 
simulation, maintenance, diagnosis, and initializa- 
tion. 



BACKGROUND 

The interface between humans and machines gener- 
ally influences the quality of HMI, especially in the 
third category of the previously presented human- 
machine systems. The design of the interface be- 
tween humans and the machines has evolved dra- 
matically in recent decades (Nagamachi, 1992). The 
first step was mechanically controlled machines. 
With the rise of numerical control, the interaction 
between human and machines changed. In the sec- 
ond step, the operator no longer has an exact knowl- 
edge about how the machine is programmed and 
cannot influence the processes in the machine. The 
third step is computerized machines, where the 
operator can influence and program a wide array of 
parameters in the machine. In this step, computer- 
ized HMI becomes a central aspect in manufactur- 
ing on the shop floor. 

The advances of computerized techniques for 
enriching the interface allow a human-centered 
modification of HMIs. This enables an effective use 
of the skills and abilities of the operators of machines 
and the features of the machines themselves. Such 
a human-centered design of manufacturing tech- 
nologies should obey the following steps (Stahre, 
1995): 

1. Consider existing skills of the user. 

2. Facilitate the maximizing of operator choice 
and control. 

3. Integrate the planning, execution, and monitor- 
ing components. 

4. Design to maximize the operator’s knowledge. 

5. Encourage social communications and interac- 
tion. 



RISE OF WEB-BASED HMI 

The usage of interoperable, adaptive, and standardized 
information technologies on the shop floor is 
essential to solve the problems in human-centered 
manufacturing and to fulfill the previously mentioned 
criteria. Due to restrictions in the capability 
of computers and their associated technologies in 
the 1980s and 1990s, the computer interfaces were 
built upon those technological limits and were not 
oriented to an optimized effectiveness of the human 
machine interaction on the shop floor. In addition, 
HMI has been machine-specific up until now and 
bound to the implementation by the facility vendor. 
The diffusion of Internet technologies within 
automation and new trends in automation technolo- 
gies provide the necessary infrastructure (Blecker, 
2003). The following trends are essential: 

1. Mobilization of Computers: For example, 
Web pads enable the mobilization of all interac- 
tions between humans and machines as well as 
between humans on the shop floor. 

2. Embedded Computing: Every machine may 
have an integrated full-featured computer that 
stores data and provides a front end; it can 
autonomously sense and respond to the environment 
(by blinking, e-mail messages, software calls, etc.) 
and offers services for machine maintenance and 
control. Embedded computers in machines and 
facilities on the shop floor induce the development 
of intelligent systems in every machine. Here, 
intelligence means that the system can trigger a wide 
array of autonomous (clearly predefined) actions on 
the occurrence of certain events. 

3. Standardization of Networks: (Industrial) 
Ethernet replaces common field busses and 
proprietary networking. It is also compatible 
with wireless networks, which enable wireless 
communication on the shop floor. 
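
The notion of "intelligence" used for embedded computing above, clearly predefined actions triggered by certain events, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the class `EmbeddedController`, the event name `overheat`, the machine identifier, and both actions are hypothetical and are not taken from any real control system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EmbeddedController:
    """Hypothetical controller that maps machine events to predefined actions."""
    rules: Dict[str, List[Callable[[dict], str]]] = field(default_factory=dict)

    def on(self, event: str, action: Callable[[dict], str]) -> None:
        # Register one more predefined action for the given event.
        self.rules.setdefault(event, []).append(action)

    def handle(self, event: str, payload: dict) -> List[str]:
        # Run every action registered for the event; unknown events do nothing.
        return [action(payload) for action in self.rules.get(event, [])]


def blink_warning_light(payload: dict) -> str:
    # Stand-in for a local signal on the machine itself.
    return f"blink: machine {payload['machine_id']}"


def notify_maintenance(payload: dict) -> str:
    # Stand-in for an e-mail message or software call to another system.
    return f"e-mail to maintenance: {payload['machine_id']} reported {payload['detail']}"


controller = EmbeddedController()
controller.on("overheat", blink_warning_light)
controller.on("overheat", notify_maintenance)

results = controller.handle(
    "overheat", {"machine_id": "M-07", "detail": "spindle at 95 C"}
)
for line in results:
    print(line)
```

The rule table keeps the behavior fully predefined: the controller never decides anything that was not registered in advance, which matches the restricted sense of intelligence described in the text.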

Consequently, Internet technologies have become 
ubiquitously available on the shop floor. The 
data and computation services will be portably ac- 
cessible from many, if not most, locations on the shop 
floor. Internet technologies also trigger a standard- 
ization of the screen design and content distribution. 
This leads to a major change in the traditional HMI, 
especially for blue-collar workers. In fact, the inter- 
action between workers and machines approxi- 
mates the common screen handling of the office 
world. Therefore, we state that the human machine 
interaction is converging into a Web-Based Human 
Machine Interaction. 

Web-based HMI is an advanced and extended 
form of computerized HMI characterized by the 
logical separation of the computer unit from the 
machine itself. Internet technologies integrate the 
human as well as the machine within a corporate 
network. They make the entirely Web-based infor- 
mation infrastructure and all of the interaction part- 
ners connected to it available for the employees as 
well as the information systems on the shop floor. By 
using Web-based interfaces for user input, screens 
can be implemented or modified rapidly. Cost sav- 
ings are realized, since any device (mobile or fixed) 
that can support a browser becomes a personal 
computer. The enhancements due to the use of 
Web-based HMI in manufacturing can be summarized 
in the following groups: 



1. An ergonomic visualization in many variants 
(colored, high resolution screens and standard- 
ized visualization technologies enable an ap- 
pealing and effective representation of data 
from the shop floor and data, for example, from 
the ERP-System). 

2. Hardware and software advancements enable 
more efficient input- and data-manipulation 
processes. 

3. The contents and screen designs are easily 
updatable and changeable. 

4. The visualization is not bound to the computer 
in the machine but connects via the 
Internet, which enables the delocalization of 
the interaction in various scenarios. 

Web-based HMI changes and enhances several 
workflows, especially in manufacturing information 
processing (AWK, 1999). This triggers several con- 
sequences. 

CONSEQUENCES OF 
WEB-BASED HMI 

Through the standardized technologies used in Web- 
based HMI, all other forms of applications that build 
upon Internet technologies are distributable on the 
shop floor to every worker. The consequences affect 
the following fields of activity with extended HMI 
processes: 

1. Collaboration on the shop floor 

2. Data collection 

3. Communication and coordination with the man- 
agement 

4. HMI processes themselves 

The availability of a fully networked computer 
enables collaboration applications (e.g., workflow 
management systems, instant messaging, and voice- 
over Internet Protocol (IP)) on the shop floor. Internet 
technologies enable the intuitive integration of those 
technologies and interfaces that support the worker 
in his or her special environment. For example, 
workers in the assembly may interact directly with 
engineers via the IP-based speech and video connec- 
tions to solve special problems or to learn specific 
work processes cooperatively. The aerospace industry 
uses such methods for the assembly of complicated 
parts of planes. Those intraorganizational virtual 
relationships help to reduce costs: Web-switched 
communication reduces or, in the case of wireless 
techniques, replaces the necessary cable lanes, and 
the use of open standards markedly eases the setup 
of infrastructure. 

The Web-based interfaces enable an accompany- 
ing data collection through the integration of data- 
entry screens into the normal workflow screens of 
the interface. The mobilization of computers allows 
the worker to have a personalized pad, which enables 
mobile data collection in various applications (e.g., 
logistics or quality assurance). Those pads or devices 
have sufficient computation power and offer connectivity 
to use specialized equipment, such as Bluetooth 
headsets for speech entry. The networked local 
computer may use a Web service on a remote 
server for speech recognition. 

The high resolution of screens and the integration 
into an intranet on the shop floor promote the setup 
of Web-based training on the job in manufacturing. 
Retrained and low-educated workers in particular 
show good results if they are trained in short 
lessons during their work hours (Schmidt & Stark, 
1996). Furthermore, the extended use of computers 
and training in computer handling have positive 
effects on the diffusion of new information systems 
and the resulting processes (Rozell & Gardner, 1999). 

The extended Web-based interaction abilities 
also virtualize the communication and coordination 
with the management of the organization. Opera- 
tors have access to upper level information via Web 
browsers, and the top-down communication be- 
comes more intuitive, which directly simplifies the 
coordination structures (Eberts, 1997). 

Although the improved interfaces have wide- 
spread effects on information handling on the shop 
floor, the most important aspect remains the HMI 
itself. HMI has several similarities to human com- 
puter interaction in the office world, although there 
are important differences. These include the exten- 
sive application of touch-screen interaction and the 
feedback via the activities of the controlled ma- 
chine. The other interaction scenarios are compa- 
rable with common HCI scenarios. Indeed, there 
are differences in the work environment (industrial 
settings), the design of the computers (use of touch 
screens, no keyboards or mice) and the abilities of 
the workers (Fakun & Greenough, 2002). These 
differences require an analysis of Web-based HMI 
on the shop floor that is distinct from the results 
of common HCI. The Web-enabled facilities induce 
two contrary consequences: 

1. Web-based information distribution leads to a 
more intuitive and efficient HMI, which decreases 
the interaction complexity for the human. This 
reduces qualification requirements, because the 
handling of machines requires less specialized 
knowledge. This leads to the hypothesis that fewer 
skills are necessary for workers interacting 
with machines. 

2. The diffusion of Internet technologies enables 
a networking of machines and information sys- 
tems, which demands the usage of those opti- 
mization possibilities for competitive improve- 
ments. This results in an increasing HMI com- 
plexity, because much more information is to 
be handled on the shop floor, which requires 
additional skills of the workers. Furthermore, 
the span of control of a single worker over 
different machines may increase. This leads to 
the hypothesis that additional skills of workers 
are necessary. 

Workers on the shop floor may not have the 
necessary skills for the extended screen-oriented 
information handling, although they are often spe- 
cialists and well-trained (Mikkelsen et al., 2002). 
Therefore, cooperation between the human resources 
and planning departments in manufacturing has to 
clarify whether the workers should receive ex- 
tended training or whether the screen and informa- 
tion design has to be adapted according to the user’s 
abilities. To evaluate the consequences in practical 
cases, we have to consider the resulting behavior 
that is necessary for fulfilling the tasks within 
manufacturing. Those behaviors can be categorized as 
follows (Stahre, 1995): 

1. Skill-Based Behavior: Well-learned, 
sensory-motor behavior analogous to nearly 
instinctive hand and foot actions while driving a 
car. 

2. Rule-Based Behavior: Actions triggered by 
a certain pattern of stimuli. A computer using 
an if-then algorithm to initiate an appropriate 
response could execute these actions. 

3. Knowledge-Based Behavior: Responding 
to new situations. High-level situation assessment 
and evaluation, consideration of alternative 
actions in light of various goals (making 
decisions and multifactor scheduling of actions). 

HMI has dramatically shifted the possible behaviors 
in operating machines. Skill-based behavior 
dominated pre-computerized HMI, where machines 
were usable only on the basis of the skills of 
workers. The rise of numeric control pushed 
rule-based behavior to the forefront. Workers 
only assisted the machines by inserting punch 
cards, which had been prepared by engineers. The 
diffusion of computerized, programmable control 
architectures enabled the direct influence of skilled 
workers again (Stahre, 1995), and the upcoming 
Web-based HMI promotes knowledge-based 
behavior on the shop floor. 

In summary, the technological advances in HMI 
have changed the machine control itself as well as the 
interaction with all actors on and off the shop floor. 
To benefit from those changes, a coordinated 
implementation of the technology and organizational 
processes is required (Wu, 2002). 

FUTURE TRENDS 

The realization of the potential of Web-based HMI 
requires an adequate implementation of technical 
and organizational structures. First, management 
has to ascertain whether a ubiquitous Web-based HMI 
infrastructure is desirable. The additional benefits of 
Web-based HMI are reasonable only if there is a 
demand for it (Stolovitch, 1999). Therefore, an 
implementation of the described technologies and 
organizational changes should be accomplished if: 

1. Extended knowledge-based behaviors are 
required; and 

2. Complex manufacturing tasks with extended 
information processing requirements on the 
shop floor are necessary. 

If those tasks are not necessary, an isolated 
application of Web-based HMI will still bring some 
benefits to the existing work processes. However, 
realizing the full potential of Web-based HMI 
requires an integration of the various information 
systems on the shop floor, the implementation of 
adequate organizational structures and managerial 
processes, as well as education strategies for online 
training on the job. Those action fields require bundled 
measures in many aspects of the factory and at the 
same time form the main future trends in Web-based 
HMI. We concentrate here on the issues concerning 
the interactions of humans and (Web-enabled) 
machines. 

Adaptation of Information Systems and 
Machining Infrastructure 

Technological barriers are critical. Web-based HMI 
scenarios require adequate machinery that has em- 
bedded computation power. Moreover, there has to 
be a networking infrastructure. Barriers result from 
the existing infrastructure. Technology manage- 
ment has to ensure the implementation of Internet 
technologies within the production system. Facilities 
as well as information systems have to be strategi- 
cally equipped with Internet technologies (Blecker, 
2006). This also means that existing information 
systems may be extended to meet the new require- 
ments. 

Development of an Education Strategy 

Indirect communication over different Internet-based 
communication technologies requires employees to 
have sufficient knowledge in handling information 
technologies. They also should have a basic under- 
standing of how the omnipresent network operates; 
otherwise, they are likely to see it as a black box. 
This would lead to a passive use of the information 
network, where an active use really is required. 
Therefore, human resource management has to 
train employees to meet the requirements. The 
training also should reduce the resistance of employ- 
ees. The suggested mechanisms make the work 
environment more transparent. However, this 
transparency has to be dealt with carefully, because it 
also allows the detailed reconstruction of usage 
and spying on the interaction behavior of the 
employees. 

Ergonomics and Motivation 

Yi and Hwang (2003) have shown that application- 
specific self-efficacy, enjoyment, and learning-goal 
orientation all determine the actual usage of a Web- 
based information system. Those aspects have to be 
considered during the setup of a Web-based human- 
machine infrastructure. Especially in the exposed 
areas on the shop floor, the design of the devices and 
the interaction possibilities beyond traditional HCI 
are important. Therefore, the distributed content has 
to be adapted for use on the shop floor, although the 
representation also has to satisfy the requirements 
of normal screen design, as is shown in Ozok and 
Salvendy (2004). The adaptation should boil down the 
information to the most important messages. This 
can be assured using semantic technologies 
(Geroimenko & Chen, 2003) and the use of short 
abstracts and keywords. The input workflows should 
be implemented in wizard style, for example, so that 
scrolling and additional mouse-like movements on 
the screen can be omitted. 
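
A wizard-style input workflow of the kind suggested above can be sketched as a fixed sequence of single-question screens, each answered with one tap on a touch screen. The sketch below is only an illustration: the classes `Step` and `Wizard`, the example questions, and the answer choices are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Step:
    """One screen of the wizard: a short prompt and a few large buttons."""
    key: str
    prompt: str
    choices: List[str]


class Wizard:
    """Presents one step at a time so that no scrolling is needed."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = list(steps)
        self.index = 0
        self.answers: Dict[str, str] = {}

    def current(self) -> Optional[Step]:
        # The step to display next, or None when the wizard is complete.
        return self.steps[self.index] if self.index < len(self.steps) else None

    def answer(self, choice: str) -> None:
        step = self.current()
        if step is None:
            raise RuntimeError("wizard already finished")
        if choice not in step.choices:
            raise ValueError(f"{choice!r} is not offered on step {step.key!r}")
        self.answers[step.key] = choice
        self.index += 1  # advance to the next screen

    @property
    def finished(self) -> bool:
        return self.index >= len(self.steps)


wizard = Wizard([
    Step("part_ok", "Is the part free of visible defects?", ["yes", "no"]),
    Step("shift", "Which shift are you on?", ["early", "late", "night"]),
])
wizard.answer("yes")
wizard.answer("late")
print(wizard.finished, wizard.answers)
```

Restricting each step to a closed list of choices is one way to avoid keyboards and mouse-like movements; free-text entry would need a different design.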

Production Portals for 
Visual Representation 

To design enterprise-wide screen guidelines based 
on the information system integration, it is necessary 
to set up a strategy for visual integration of informa- 
tion systems as well as the machine control for the 
workers on the shop floor. Production portals are a 
solution to those challenges. A production portal is a 
digital enterprise portal that is used by a manufacturing 
organization or plant as a means to assist its 
decision-making activities (Huang & Mak, 2003). 
These portals are able to deliver adapted interfaces, 
for example, for experts or beginners with the help 
of dynamically generated pages based on Web 
technologies. Through the dynamic linking capabili- 
ties of Web technologies (e.g., the use of Web 
services for the delivery of information from enter- 
prise resource planning systems), the integration of 
all information sources into one screen design can be 
realized. Due to the characteristics of work on the 
shop floor, multitasking also is not a desired feature. 
An explorer-like tree (Botsch & Kunz, 2001) orga- 
nizes all of the personalized features that are rel- 
evant for the worker. In this case, workers do not 
have to work with different application windows but 
can navigate in one browser window between infor- 
mation sources and data entry forms through rela- 
tively simple links. 
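
The idea of an explorer-like tree that shows each worker only the personalized, relevant entries can be sketched as a role-filtered navigation tree. This is a minimal illustration, not the design of any cited portal: the `Node` class, the role names `beginner` and `expert`, and all labels and URLs are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """One entry in the portal tree; `roles` lists who may see it."""
    label: str
    roles: Set[str]
    url: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def personalize(node: Node, role: str) -> Optional[Node]:
    """Return a copy of the tree containing only entries visible to `role`."""
    if role not in node.roles:
        return None
    visible = [personalize(child, role) for child in node.children]
    kept = [child for child in visible if child is not None]
    return Node(node.label, node.roles, node.url, kept)


def render(node: Node, depth: int = 0) -> List[str]:
    """Flatten the tree into indented lines, as one browser page might list it."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


portal = Node("Production portal", {"beginner", "expert"}, children=[
    Node("Machine control", {"beginner", "expert"}, "/machines", [
        Node("Start/stop job", {"beginner", "expert"}, "/machines/job"),
        Node("Edit NC parameters", {"expert"}, "/machines/parameters"),
    ]),
    Node("ERP order list", {"beginner", "expert"}, "/erp/orders"),
    Node("Diagnosis", {"expert"}, "/diagnosis"),
])

beginner_view = render(personalize(portal, "beginner"))
expert_view = render(personalize(portal, "expert"))
print("\n".join(beginner_view))
```

Because every entry is a plain link inside one tree, the worker stays in a single browser window, which fits the remark that multitasking is not desired on the shop floor.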

CONCLUSION 

The evolution in human machine systems will be 
driven in the future by new information technologies. 
Management has to react to those changes by the 
application of the latest technological advancements 
in interface design. Special attention is to be further 
placed on input technologies such as augmented 
reality or, for example, data gloves, which will be 
integrated into the human-machine system through 
Internet technologies (Roco & Bainbridge, 2003). 
Furthermore, human-centered aspects, such as cog- 
nitive models of workers, that are psychologically 
tested also have to be integrated into screen and/or 
input interface design. Web-based infrastructures 
enable the necessary flexibility and adaptability of 
interfaces. 

KEY TERMS 

Embedded Devices: Full-featured computers 
that are integrated into machines. 

HMI in Manufacturing: Relation between a 
human operator and one or more machines via an 
interface for embracing the functions of machine 
handling, programming, simulation, maintenance, di- 
agnosis, and initialization. 

Industrial Ethernet: Ethernet technology that 
is adjusted to specific environmental conditions (e.g., 
regarding electromagnetic compatibility, shaking, 
moisture, and chemical resistance in manufactur- 
ing). 

Production Portal: The linking of all available 
information systems into one standardized screen. 
Production portals aggregate heterogeneous sys- 
tems in manufacturing and provide secure, struc- 
tured, and personalized information for individual 
users (e.g., based on job functions). 

Ubiquitous Computing: Trend to integrate in- 
formation and communication technologies into all 
devices. 



Voice-Over IP: Standard for making telephone 
calls via an Internet connection. It enables the 
flexible use of different input devices, including 
video telephone applications. 

Web-Based HMI: An advanced and extended 
form of computerized HMI characterized by the 
logical separation of the computer unit from the 
machine itself. 

Web Pad (or Handheld PC): Devices that are 
connected via wireless technologies to an intranet 
(WLAN, Bluetooth, GPRS/UMTS) and offer a full- 
featured operating system with a Web browser. 

Web Service: The term Web services describes 
a standardized way of integrating Web-based appli- 
cations using the XML, SOAP, WSDL, and UDDI 
open standards over an Internet protocol backbone. 
XML is used to tag the data, SOAP is used to 
transfer the data, WSDL is used to describe the 
services available, and UDDI is used to list what 
services are available. Used primarily as a means 
for businesses to communicate with each other and 
with clients, Web services allow organizations to 
communicate data without intimate knowledge of 
each other’s IT systems behind the firewall. 
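
To make the division of labor in the Web Service definition concrete (XML tags the data, SOAP carries it), the following sketch builds and reads a SOAP 1.1 envelope with Python's standard library. It is an illustration only: the application namespace `http://example.com/shopfloor`, the operation `GetOrderStatus`, and the parameter `orderId` are hypothetical, and no message is actually sent over a network.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, assumed for this example only.
APP_NS = "http://example.com/shopfloor"


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes one XML-tagged child element.
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def read_params(message: bytes) -> dict:
    """Extract the parameters of the first operation in the SOAP body."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    # Strip the "{namespace}" prefix that ElementTree puts on tag names.
    return {child.tag.split("}", 1)[1]: child.text for child in call}


message = build_request("GetOrderStatus", {"orderId": 4711})
params = read_params(message)
print(params)
```

The receiver needs only the agreed message format, not knowledge of the sender's internal systems, which is the point made in the definition above. WSDL and UDDI, which describe and list services, are not covered by this sketch.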

INTRODUCTION 

Information and communication technologies have 
played a fundamental role in teaching and learning 
for many years. Technologies, such as radio and TV, 
were used during the 1950s and 1960s for delivering 
instructional material in audio and/or video format. 
More recently, the spread of computer-based edu- 
cational systems has transformed the processes of 
teaching and learning (Squires, Conole, & Jacobs, 
2000). Potential benefits to learners include richer 
and more effective learning resources using multi- 
media and a more flexible pace of learning. In the 
last few years, the emergence of the Internet and the 
World Wide Web (WWW) have offered users a new 
instructional delivery system that connects learners 
with educational resources and has led to a tremen- 
dous growth in Web-based instruction. 

Web-based instruction (WBI) can be defined as 
using the WWW as the medium to deliver course 
material, manage a course (registrations, supervi- 
sion, etc.), and communicate with learners. A more 
elaborate definition is due to Khan (1997), who 
defines a Web-based instructional system (WIS) as 
“...a hypermedia-based instructional program which 
utilises the attributes and resources of the World 
Wide Web to create a meaningful learning environ- 
ment where learning is fostered and supported.” 
Relan and Gillani (1997) have also provided an 
alternative definition that incorporates pedagogical 
elements by considering WBI as “...the application 
of a repertoire of cognitively oriented instructional 
strategies within a constructivist and collaborative 
learning environment, utilising the attributes and 
resources of the World Wide Web.” 

Nowadays, WISs can take various forms de- 
pending on the aim they serve: 

• Distance-learning (DL) systems aim to provide 
remote access to learning resources at a 
reduced cost. The concept of DL (Rowntree, 
1993) is based on: (i) learning alone, or in small 
groups, at the learner’s pace and in their own 
time and place, and (ii) providing active learning 
rather than passive with less frequent help 
from a teacher. 

• Web-based systems, such as intelligent tutor- 
ing systems (Wenger, 1987), educational 
hypermedia, games and simulators (Granlund, 
Berglund, & Eriksson, 2000), aim at improving 
the learning experience by offering a high level 
of interactivity and exploratory activities, but 
require a significant amount of time for devel- 
opment. The inherent interactivity of this ap- 
proach leads learners to analyse material at a 
deeper conceptual level than would normally 
follow from just studying the theory and frequently 
generates cognitive conflicts that help 
learners to discover their possible misunder- 
standings and reconstruct their own cognitive 
models of the task under consideration. 

• Electronic books provide a convenient way to 
structure learning materials and reach a large 
market (Eklund & Brusilovsky, 1999). 

• Providers of training aim to offer innovative 
educational services to organisations for workplace 
training and learning, such as to supplement 
and support training in advance of live 
training, update employee skills, and develop new 
skills. 

The main difference between WBI and the tra- 
ditional computer-based instructional programs lies 
in the way information is presented to the user. The 
WISs’ approach to e-learning not only provides 
“active learning,” which according to Bates (1991) 
is the most effective way to learn, but also 
interactivity, which is a well-known facilitator of the 
learning experience (Mason & Kaye, 1989). Thus, 
we have, on the one hand, traditional instructional 
programs which present educational content in a 
linear fashion using a static structure, and on the 
other hand, WISs that exploit the hypermedia 
capabilities, for example, offering flexibility in the 
delivery of instruction through the use of hyperlinks 
(Federico, 1999). As a consequence, WBI has led to 
a new model for teaching and learning that focuses 
on the learner not as passive recipient of knowledge 
but as an active, self-directed participant in the 
learning process. Nevertheless, this approach to 
instruction has also created a series of challenges 
that users of educational technology, such as teach- 
ers, learners, providers of educational content, edu- 
cational institutions and so forth, have to meet: (i) 
ensure the improvement of the learning experience, as 
technology-driven innovations usually consume 
prodigious amounts of time and money to little 
educational effect; and (ii) bring a real and 
substantial change in 
education by improving their understanding of learn- 
ing and teaching with the use of this new technology. 

This article presents the main features of Web- 
based instructional systems, including their advan- 
tages and disadvantages. It discusses critical factors 
that influence the success and effectiveness of 
WISs. It stresses the importance of pedagogy in 
WBI and explores the pedagogical dimensions of the 
interface tools and functionalities of WISs. Lastly, it 
summarises future trends in Web-based instruction. 



BACKGROUND 

The appeal of WISs lies in their ability to actively 
engage learners in the acquisition and use of infor- 
mation, support multiple different instructional uses 
(tutoring, exploration, collaboration, etc.), support 
different learning styles and promote the acquisition 
of different representations that underlie expert- 
level reasoning in complex, ill-structured domains 
(Selker, 1994). Learners select the knowledge they 
perceive as being most suited to their needs. But, 
although the act of browsing is a pleasing experi- 
ence, browsing in an unknown domain is not likely to 
lead to satisfactory knowledge acquisition at all 
(Jonassen, Mayes, & McAleese, 1993). Thus, navi- 
gational aids, such as a pre-defined hierarchical 
structure of the subject matter, are necessary espe- 
cially in large domains. The pre-defined structure of 
the domain knowledge provides learners (especially 
novices) with guidance during their study, offering 
them a sense of safety and a reliable navigation path. 
In this way, learners are supported in constructing
their own individualised model of the knowledge
space and are able to follow paths through the
subject content produced by designers, or to develop
their own routes according to individually-prescribed
requirements (Large, 1996).
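
As an illustration of such a navigational aid, the following minimal Python sketch models a pre-defined hierarchical structure of the subject matter. It derives a designer-defined default reading path and a trail showing where a topic sits in the hierarchy. The `Topic` class, the function names, and the sample course are hypothetical and are not taken from any system discussed in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical node in a pre-defined hierarchy of the subject matter."""
    title: str
    children: list = field(default_factory=list)


def guided_path(root):
    """Depth-first, pre-order walk: one possible designer-defined reading order."""
    path = [root.title]
    for child in root.children:
        path.extend(guided_path(child))
    return path


def breadcrumb(root, target):
    """Return the titles from the root down to `target`, or None if it is absent."""
    if root.title == target:
        return [root.title]
    for child in root.children:
        trail = breadcrumb(child, target)
        if trail is not None:
            return [root.title] + trail
    return None


# Illustrative course structure (assumed content).
course = Topic("Statistics", [
    Topic("Descriptive statistics", [Topic("Mean"), Topic("Variance")]),
    Topic("Inference", [Topic("Hypothesis tests")]),
])

default_route = guided_path(course)          # path offered to novices
location = breadcrumb(course, "Variance")    # "where am I?" trail
```

A learner who prefers to develop an individual route can ignore `default_route`, while the trail returned by `breadcrumb` still indicates the current position within the domain.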

Another attractive element is the flexibility to 
access course contents through intranets and the 
Internet at any time and from different places, which 
is considered as the main reason many educators 
have tried to develop distance learning programs on 
the WWW. This flexibility creates many opportuni- 
ties for exploration, discovery, exchange/sharing of 
information and learning according to learners’ indi- 
vidual needs. Flexibility, however, comes at a price: 

• The complexity of the system may increase 
(Ellis & Kurniawan, 2000). Users may need 
more time to search for the information (Ng & 
Gunstone, 2002), and the dynamism and rich- 
ness of the content may negatively affect learn- 
ers’ level of comprehension (Power & Roth, 
1999). 

• Despite the plethora of communication tools, 
learners sometimes find feedback insufficient, 
feel isolated or not supported enough, and drop 
out of the course (Quintana, 1996). 

• It is unlikely that all learners are equally able to
perform their own sequencing, pacing, and
navigation. Moreover, the learner is not always
going to choose the content to study next in a
way that will lead to effective learning
(Hammond, 1992; Leuthold, 1999).

• Previous knowledge of the domain content 
varies for different learners, and indeed knowl- 
edge may grow differently through the interac- 
tion with the system (Winkels, 1992). 

• Learners tend to get lost, especially when the
educational content is large and/or when they
are novices. This disorientation, experienced
when users do not know where they are within
hypertext documents or how to move towards the
desired location, is commonly known as “lost in
hyperspace” (Brusilovsky, 2001).

• Learners may fail to get an overview of how all 
the information fits together when browsing. In 
the absence of information that might help 
them formulate knowledge goals and find rel- 
evant information, learners may stumble through 
the content in a disorganised and instructionally 
inefficient manner (Hammond, 1992). Furthermore,
if learners are too accustomed to memorising and
are faced with multiple explanations of the same
knowledge, they may attempt to memorise them all.
This is one of the aspects of a problem known as
“information overload,” which is usually experienced
by users of WISs (McCormack & Jones, 1998).



CRITICAL ISSUES FOR DESIGNING 
AND DELIVERING WEB-BASED 
INSTRUCTION 

The Role of the Users 

In WBI, the roles of teachers and learners are 
different from their classic definitions. Thus, teach- 
ers design educational content that is attractive to 
learners in order to motivate them, interact with the 
learners, and act as facilitators of the learning 
process. Learners are mainly responsible for their 
own learning, assessment of knowledge goals and 
objectives. As a consequence, learners need to be 
able to form their own ideas about the content and 
understand the educational material in their own 
way. That change of roles requires a broadening of
skills and competencies for both teachers and
learners. Table 1 highlights the differences in users’
roles and the impact of WBI.

The Pedagogy of 
Web-Based Instruction 

The need for changing instructional methods has
come partly in response to demands of the workplace
and partly because of a re-assessment of instructional
methodologies. Individuals are now expected
to be adaptable to modern ways of communication,
such as e-mail systems, the Internet, intranets, the
WWW, and conferencing systems. They are also
expected to apply high-level cognitive skills, such as
analysing, summarising, and synthesising information,
as well as engaging in creative and critical thinking
(Vogel & Klassen, 2001). In principle, WISs can serve
this purpose, but the greatest benefits of their use can
occur via a pedagogic approach that most effectively
uses the characteristics of this technology to
increase the quality of the learning experience.

As a result, a number of educational trends that
have emerged in recent years play a particularly
important role in Web-based instruction; three of
these are presented in the following:






• Individualised Learning: This approach gives
learners the capability to select the mode
of delivery and timing of module material. For
example, learners can choose a blended way
of learning which consists of lectures, participation
in traditional face-to-face communication
in a classroom, and collaborative work in
a remote environment on the WWW.

• Constructivist Theory: The constructivist
perspective describes learning as change in
meaning constructed from experience (Newby,
Stepich, Lehman, & Russell, 1996).
Constructivism covers a wide diversity of
perspectives that consider learning as an active
process of constructing rather than acquiring
knowledge, and instruction as a process of
supporting that construction rather than
communicating knowledge (Duffy & Cunningham, 1996).

• Experiential Learning: According to Kolb
(1984), experiential learning involves the following
steps: concrete experience; observation
and reflection; formulation of abstract
concepts and generalisations; testing of the
implications of the concepts in new situations.
Experiential learning can take different forms:
learning by doing (Graf & Kellogg, 1990);
experience-based learning, trial and error
and applied experiential learning (Gentry,
1990); reflection in action (Senge, 1995);
action learning (Pedler, 1997). But experience
must be accompanied by reflection, as
experience alone does not automatically lead to
learning. This is important for both teachers
and students.

Table 1. Users, roles, and Web-based instruction

Teachers’ role (instructor): Identification of learning
outcomes; structuring and sequencing of domain
knowledge; designing educational activities and
assessments.

Teachers’ role (facilitator): Response to questions;
providing consistent and timely feedback; encouraging
discussion among learners; motivating learners and
reinforcing effective study habits.

Learners’ role: Study the educational content;
undertake responsibility for their learning; adopt new
forms of communication and new ways of learning.

WBI: Interactive tools; information sharing and
communication mechanisms; individualised assessment;
distributed educational resources.

The WWW and especially hypermedia provide
an eminently suitable environment for the development
of educational systems that adopt these instructional
models; that is, educational hypermedia
are considered excellent representations of
constructivist approaches in theory (Jonassen et al.,
1993). To set up a WIS that facilitates these forms of
learning, one should ensure it contains elements
such as attraction of attention, recall of
prior knowledge, consistent presentation style and
structure, group work or individual tasks,
self-assessment questions, practice/exercises, feedback,
review, learning guidance, and post knowledge. The use
of the WWW adds extra dimensions to teaching and
learning. But for learning to take place, the learner
has to be not only active but also engaged in the
learning process. Table 2 provides an example,
linking the learner’s involvement and acquired skills
(following Bloom’s (1956) taxonomy of
intellectual behaviour) with the types of educational
content in a WIS.

Planning, designing, and implementing WBI in- 
cludes several dimensions, which of course contrib- 
ute to the effectiveness of this approach. Among a 
number of factors, the user interface of the educa- 
tional system, the communication facilities offered 
and the educational content are of particular im- 
portance. Table 3 gives an overview of pedagogical 
considerations for designing a WIS. 

The considerations for the components in Table
3 show that WBI strives to create environments that
favour a constructivist model of learning, one that
allows learners to actively construct, transform and
extend their knowledge; to engage actively in the
interpretation of the content and reflect on their
interpretations; and to link educational content with
real-world situations (Jonassen, 1994). Thus, through
exploration of educational material that addresses
different knowledge levels, learning objectives, and
learning styles, learners take responsibility for their
learning.

Table 2. Pedagogical aspects of Web-based instruction

Skills/abilities | Learner’s involvement | Type of content
Knowledge: recall studied content | Memorisation of knowledge (from specific facts to complete theories). | Hypertext and images
Comprehension: grasp the meaning of the content | Interpreting, explaining or summarising the material; estimating future trends (predicting consequences or effects). Taking up tests about knowledge of facts, theories, procedures, etc. | Hypertext and images, self-assessment questions
Application: apply learned material to new and concrete situations | Applying principles, concepts, laws, and theories. | Examples, self-assessment questions
Analysis: break down material into components and understand the organisational structure of the content | Identification of components, analysis of their interrelationships, and recognition of the organisational principles involved. An understanding of the content and the structural form of the material is required. | Examples, self-assessment questions
Synthesis: put parts of material together to form a new whole | Production of a unique communication, a work plan or set of abstract relations between concepts. Develop creative behaviours with major emphasis on the formulation of new patterns or structure. | Interactive tools, simulations, case studies, self-assessment questions
Evaluation: judge the value of the content for a given purpose/task | Making conscious judgements based on clearly defined criteria or goals. | Interactive tools, simulations, case studies, self-assessment questions

Table 3. Pedagogical dimension of system’s components in Web-based instruction

Component | Pedagogical Role
User interface | - Reduce learner’s anxiety: consistent and easy-to-use.
               | - Support learners and teachers in tasks completion: provide tools based on users’ profile.
Communication facilities | - Enhance cognitive skills: help formulate ideas, elaborate on the subject matter.
                         | - Support collaboration and interaction: among learners themselves and/or between learners and educators.
Educational content | - Main source of information: use of a user-friendly language, accessible, easily understandable.
                    | - Support different learning styles: include types of content, various levels of difficulty.
                    | - Emphasise exploration: adopt a hypermedia form of presentation, provide different types of resources, simulations, learning by discovery.
                    | - Enhance social skills: include group work, projects.
                    | - Evaluate knowledge: self-assessment questions, projects, various types of assessment.
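
The mapping in Table 2 between skill levels and types of content can be sketched as a simple lookup, shown below in Python. The level names and content types follow Table 2; the function `content_up_to` and its cumulative behaviour (collecting content for every level up to a target level) are illustrative assumptions rather than part of any described system.

```python
# Skill levels in the order used by Table 2 (lowest to highest).
BLOOM_LEVELS = [
    "knowledge", "comprehension", "application",
    "analysis", "synthesis", "evaluation",
]

_RICH = ["interactive tools", "simulations", "case studies",
         "self-assessment questions"]

# Types of content per level, transcribed from Table 2.
CONTENT_TYPES = {
    "knowledge": ["hypertext and images"],
    "comprehension": ["hypertext and images", "self-assessment questions"],
    "application": ["examples", "self-assessment questions"],
    "analysis": ["examples", "self-assessment questions"],
    "synthesis": list(_RICH),
    "evaluation": list(_RICH),
}


def content_up_to(target_level):
    """Collect content types for all levels up to the target (assumed policy)."""
    selected = []
    for level in BLOOM_LEVELS[: BLOOM_LEVELS.index(target_level) + 1]:
        for content_type in CONTENT_TYPES[level]:
            if content_type not in selected:   # keep first occurrence only
                selected.append(content_type)
    return selected
```

For instance, a unit that targets the application level would, under this assumed policy, combine hypertext and images, self-assessment questions, and examples.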

Individual Differences 

Learners differ in traits such as skills, aptitudes and
preferences for processing information, constructing
meaning from information, and applying it to
real-world situations. Recent approaches to WBI try to
take into account various dimensions of individual
differences, such as the level of knowledge or
literacy, gender, culture, spatial abilities, cognitive
styles, learning styles, and accessibility issues for the
disabled and elderly. To this end, learner-centered
approaches, which have been motivated by sociocultural
and constructivist theories of learning
(Soloway et al., 1996), have been proposed.
Learner-centered design acknowledges that understanding
learners’ needs is of primary importance in providing
effective WBI to heterogeneous student populations
(Soloway, Guzdial, & Hay, 1994; Quintana, Krajcik,
& Soloway, 2000).

The impacts of individual differences on WBI 
have been investigated along different dimensions: 



• Cognitive and learning styles, which refer to a
user’s information processing habits, have an
impact on the user’s skills and abilities, such as
preferred modes of perceiving, thinking,
remembering, and problem solving (Ford & Chen,
2000, 2001; Shih & Gamon, 2002).

• Gender differences affect WBI in the sense
that males and females have different requirements
with respect to navigation support and
interface features. The preferences of males
and females also differ markedly in
terms of information seeking strategies, media
preferences, and learning performance
(Campbell, 2000; Large, Beheshti, et al., 2002;
Leong & Hawamdeh, 1999; Liu, 2003).

• Prior knowledge and system experience
affect learners’ interactions with the WIS and
their level of knowledge of the educational
content. The impact of this dimension of individual
differences depends on learners’ previous
understanding of the educational content
(for example, from relevant studies) and on their
familiarity with the WIS’s features and
functionalities (for example, familiarity with
distance-learning systems) (Reed & Oughton, 1997;
Lawless & Kulikowich, 1998; Last, O’Donnell, &
Kelly, 2001).

The effects of individual differences on the degree
of success or failure experienced by learners need
to be evaluated empirically in more detail to fully
understand their impact on the quality of learning
attained within WISs (Magoulas, Papanikolaou, &
Grigoriadou, 2003). Indeed, the main problem in
exploiting such information in a WIS is to determine
which characteristics should be used (are worth
modelling) and how (what can be done differently for
learners with different preferences or styles)
(Brusilovsky, 2001). In the next section, this
problem is addressed in the context of personalisation
technologies, which are considered a promising
approach to accommodating individual differences.

FUTURE TRENDS 

Personalised learning environments (PLEs) form
a relatively recent area of research that
aims at alleviating the “information overload” and
“lost in hyperspace” problems by integrating two
distinct technologies in WBI: intelligent tutoring
systems (ITS) and educational hypermedia systems.
This is in effect a combination of two approaches to
WISs: the more directive tutor-centred style of
traditional tutoring systems and the flexible
learner-centred browsing approach of educational
hypermedia systems (Brusilovsky, 2001).



PLEs adapt the content, structure, and/or pre- 
sentation to each individual user’s characteristics, 
usage behaviour, and/or usage environment. 
Personalisation usually takes place at three different 
levels: content level, presentation level, and naviga- 
tion level. For example, in a system with 
personalisation at the content level, the educational 
content is generated or assembled from various 
pieces depending on the user. Thus, advanced learn- 
ers may receive more detailed and deep information, 
while novices will be provided with additional expla- 
nation. At the presentation level, adaptive text and 
adaptive layout are two widely used techniques. 
Adaptive text implies that the same Web page is
assembled from different texts according to the
learner’s current needs, for example by removing
some information from a piece of text or inserting
extra information to suit the current user. Adaptive
layout aims to differentiate levels of the subject
content by changing the layout of the page (such as
font type and size, and background colour) instead of
the text. At the navigation level, the most popular
techniques include direct guidance, adaptive
ordering, link hiding, and link annotation.
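
The content-level and navigation-level techniques described above can be illustrated with a minimal Python sketch. It assumes a very simple learner model (a knowledge level and a set of mastered concepts) and pages that declare their prerequisites; all class names, field names, and status labels are hypothetical and do not reproduce the implementation of any system listed in Table 4.

```python
from dataclasses import dataclass, field


@dataclass
class Learner:
    """Hypothetical learner model."""
    level: str                                  # "novice" or "advanced" (assumed labels)
    mastered: set = field(default_factory=set)  # titles of pages already mastered


@dataclass
class Page:
    """Hypothetical content page built from optional fragments."""
    title: str
    core: str
    extra_explanation: str = ""                 # additional explanation for novices
    in_depth: str = ""                          # more detailed material for advanced learners
    prerequisites: frozenset = frozenset()


def assemble_content(page, learner):
    """Content level: assemble the page from fragments that suit the learner."""
    parts = [page.core]
    if learner.level == "novice" and page.extra_explanation:
        parts.append(page.extra_explanation)
    elif learner.level == "advanced" and page.in_depth:
        parts.append(page.in_depth)
    return " ".join(parts)


def annotate_links(pages, learner, hide_unready=False):
    """Navigation level: link annotation, with optional link hiding."""
    links = []
    for page in pages:
        if page.title in learner.mastered:
            status = "done"
        elif page.prerequisites <= learner.mastered:
            status = "ready"
        else:
            status = "not ready"
        if hide_unready and status == "not ready":
            continue                            # link hiding
        links.append((page.title, status))      # link annotation
    return links


def direct_guidance(pages, learner):
    """Navigation level: suggest the first page that is ready to be studied."""
    for title, status in annotate_links(pages, learner):
        if status == "ready":
            return title
    return None


# Illustrative data (assumed content).
pages = [
    Page("Variables", "A variable names a value.",
         extra_explanation="Think of it as a labelled box."),
    Page("Loops", "A loop repeats a block.",
         prerequisites=frozenset({"Variables"})),
    Page("Recursion", "A function may call itself.",
         prerequisites=frozenset({"Loops"})),
]
learner = Learner(level="novice", mastered={"Variables"})
```

With this data, `annotate_links(pages, learner)` marks "Loops" as ready and "Recursion" as not ready, and `direct_guidance` suggests "Loops". Presentation-level adaptation (font, colour, layout) is omitted from the sketch because it depends on the rendering technology.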



Table 4. Web-based instructional systems that employ personalisation features (adapted from
Magoulas et al., 2003)

System and Subject domain | Individual Differences Dimension | Level of Personalisation | Pedagogical Approach
CS383 (Carver et al., 1996), Computer Systems | Learning style | Presentation | Media selection based on learners’ learning style
AST (Specht et al., 1997), Introductory Statistics | Knowledge level; Learning style; User preferences | Content; Navigation | Multiple teaching strategies
ELM-ART II (Weber & Specht, 1997), Programming in Lisp | Knowledge level; User preferences | Content; Navigation | Example-based programming
DCG (Vassileva, 1997, 1998), Domain Independent | Knowledge level; Learning goal; User preferences | Content | Generic Task Model Theory
INTERBOOK (Brusilovsky et al., 1998), Domain Independent | Knowledge level | Content; Navigation | N/A
KBS-HYPERBOOK (Henze et al., 1999), Introduction to Programming using Java | Knowledge level; Learning goals | Navigation | Project-based learning
ARTHUR (Gilbert & Han, 1999), Computer Science Programming | Learning style | Content | Multiple instructional styles: visual-interactive, auditory-text, auditory-lecture, text style
INSPIRE (Papanikolaou et al., 2003), Computer Architecture | Knowledge level; Learning style | Presentation; Content; Navigation | Component Display Theory; Elaboration Theory
AES-CS (Triantafillou, Pomportsis, & Demetriadis, 2003), Multimedia Technology Systems | Knowledge level; Cognitive style | Presentation; Content; Navigation | Multiple instructional strategies

Table 4 (adapted from Magoulas et al., 2003)
presents the features of several PLEs with respect
to: the individual student characteristics used to
guide the personalisation (see “Individual Differences
Dimension” column), the type of personalisation
provided (see “Level of Personalisation” column), and
the teaching/learning approach or theory (see
“Pedagogical Approach” column).

Several approaches to evaluating the performance
of PLEs have been proposed in the literature, and
the empirical results look promising
(Weibelzahl, Lippitsch, & Weber, 2002). However,
many questions are still open in this context. Among
the most critical are questions related to the
level of tutor and learner control over the PLE, the
development of appropriate methods of assessing
information about the behaviour of the learner in
the course of learner-system interaction, and the
systematic evaluation of the effectiveness of
personalisation.



CONCLUSION 

Advances in technology are increasingly impacting
the way in which the curriculum is delivered and
assessed. Ever-increasing learner needs make it
particularly important for Web services to provide
learning tools. The attraction of WISs lies in their
capability to actively engage the learner in the 
learning process, support multiple instructional uses 
(tutoring, exploration, research, etc.) and different 
learning styles, provide feedback mechanisms and 
promote the acquisition of various skills. There are 
of course some critical factors that influence the use 
of WISs in an educational setting. This article cov- 
ered issues related to teachers’ and learners’ new 
roles, learner-centered design and pedagogical con- 
siderations, which in our opinion are the most impor- 
tant ones to fully exploit the benefits of WBI in 
education. 

KEY TERMS

Computer-Assisted Instruction: The use of
computers in educational settings, for example,
tutorials, simulations, and exercises. It usually refers
either to stand-alone computer learning activities or to
activities that reinforce educational material
introduced and taught by teachers.

Constructivism: Teaching model that consid- 
ers learning as the active process of constructing 
knowledge, and instruction as the process of sup- 
porting that construction. 

Educational Hypermedia: Web-based learn- 
ing environments that offer learners browsing through 
the educational content supported by flexible user 
interfaces and communication abilities. 

Educational Technology: The use of technol- 
ogy to enhance individual learning and to achieve 
widespread education. 

Individual Differences: In the context of Web- 
based instruction, this term is usually used to denote 
a number of important human factors, such as 
gender differences, learning styles, attitudes, abili- 
ties, personality factors, cultural backgrounds, prior 
knowledge, knowledge level, aptitudes and prefer- 
ences for processing information, constructing mean- 
ing from information, and applying it to real-world 
situations. 

Information Overload: Learners face the in- 
formation overload problem when acquiring increas- 
ing amounts of information from a hypermedia sys- 
tem. It causes learners frustration with the technol- 
ogy and anxiety that inhibits the creative aspects of 
the learning experience. 

Instructional/Pedagogical Design/Approach:
In the context of Web-based instruction, this usually
relates to pedagogical decision-making, which concerns
two different aspects of the system design:
planning the educational content (what concepts
should be the focus of the course) and planning the
delivery of instruction (how to present these concepts).

Lost in Hyperspace: This is a feeling experienced
by learners when losing any sense of location
and direction in the hyperspace. It is also called
disorientation and is caused by badly-designed
systems that do not provide users with navigation
tools, signposting, or any information about their
structure.

Web-Based Instruction: Can be defined as 
using the Web as the medium to deliver course 
material, administer a course (registrations, supervi- 
sion, etc.), and communicate with learners. 